Progress in Chemotherapy of Leishmaniasis*



Vishnu Ji Ram* and Mahendra Nath 



Medicinal Chemistry Division, Central Drug Research Institute, Lucknow-226001, 
India 



Abstract: This review gives an overview of the development of the
chemotherapy of leishmaniasis and the current state of knowledge of the
disease processes.




Introduction 

Leishmaniasis is one of the most devastating disease
complexes and a major health problem of tropical,
subtropical and Mediterranean regions [1-8]. It occurs in
all continents except Australia [9]. About 350 million
people all over the world are at risk of acquiring
leishmaniasis, and it has been estimated that 12 million
new cases occur each year [10].

Although visceral leishmaniasis is widely distributed
throughout the tropics, it has been most severe in the
Indian subcontinent and South West Asia [11]. In India
the prevalent form is visceral leishmaniasis (Kala-azar),
caused by L. donovani, which is found in Bihar, West
Bengal and the eastern parts of Uttar Pradesh. Kala-azar
is the more lethal form and produces up to 98% mortality
in untreated cases. It is one of the major communicable
diseases of the tropics and a threat to the entire
country's population. In India alone about 3.0 million
people are infected with leishmania, and at least 1000
deaths per annum are estimated.

Leishmaniasis is a parasitic disease caused by invasion
of the reticulo-endothelial system of the host by
intracellular parasites known as leishmania [12,13].
These unicellular parasites are transmitted from host to
host by the bite of a vector sandfly [14].

Life Cycle and Morphology 

Leishmania is a protozoan with a heterogeneous life
cycle that alternates regularly between the Phlebotomine
vector and the vertebrate host. Throughout this cycle the
parasite is repeatedly transformed from a non-motile,
intracellular form in the vertebrate macrophage to a
motile, extracellular form in the lumen of the insect gut.

*CDRI Communication No. 5521 



The leishmania parasite exists in two morphological
forms, known as the promastigote and the amastigote. The
promastigotes are about 20 µm long and 1.5-3.0 µm broad,
with a single long flagellum, and multiply by binary
fission as extracellular parasites in the gut lumen of
the female sandfly.

The amastigotes are 2-5 µm long, intracellular,
non-motile, uninucleate ovoid organisms containing a
rod-shaped kinetoplast associated with a flagellar
rudiment. They multiply repeatedly by binary fission,
eventually destroying the macrophages of the vertebrate
host. When an amastigote is ingested by a Phlebotomine
sandfly, it elongates in the fly's gut and transforms into
a flagellated promastigote or leptomonad.

The disease is transmitted to the vertebrate host
through the bite of an infected sandfly, which transfers
promastigotes. These promastigotes are rapidly taken up
by phagocytic cells of the reticulo-endothelial system of
the host, where they are transformed into amastigotes.
The amastigotes circulate in the blood and multiply
asexually by longitudinal binary fission inside the
macrophages of the liver, spleen, bone marrow, mucosa
and reticulo-endothelial system.



Classification of Leishmaniasis 

Leishmaniasis is classified on the basis of
symptomatology as follows:

(i) Cutaneous leishmaniasis

(ii) Visceral leishmaniasis 

(iii) Mucosal or mucocutaneous leishmaniasis 

(iv) Diffuse cutaneous leishmaniasis

Cutaneous Leishmaniasis

It is further classified as: 

(a) Old world cutaneous leishmaniasis 

(b) New world cutaneous leishmaniasis 

Old world cutaneous leishmaniasis is caused by
L. tropica and L. major, whereas new world cutaneous
leishmaniasis is produced by L. braziliensis and
L. mexicana. The clinical manifestations of both forms
are similar, although the lesions due to the latter are
more severe and chronic.

Visceral Leishmaniasis (Kala-azar) 

It is caused by L. donovani donovani, L. donovani
infantum and L. donovani chagasi. It is the most
devastating form among the complex of leishmaniases.
The common symptoms of visceral leishmaniasis are fever,
weight loss, anorexia, discomfort, changes in hair color,
and enlargement and marked alteration in function of the
liver, spleen, bone marrow and lymph nodes.

Mucocutaneous Leishmaniasis 

This form of the disease is produced by L. braziliensis
braziliensis, L. braziliensis panamensis and
L. braziliensis guyanensis. The mucosal lesions are more
frequent in South America and are quite similar to those
of other types of cutaneous leishmaniasis. The symptoms
are ulceration, nasal blockage and swelling of the nose
and lips, with destruction of the soft tissue of the
oronasal cavity.

Diffuse Cutaneous Leishmaniasis

It is caused by various species of leishmania and is
characterized by disseminated skin lesions with
thickening in plaques and multiple nodules.

Diagnosis 

Leishmaniases are diagnosed by tissue biopsy: of skin
for cutaneous leishmaniasis, and of bone marrow, spleen
and lymph nodes for visceral leishmaniasis. Serological
tests are also used for supportive diagnosis. The
sensitive immunofluorescent antibody test (IFAT) and the
counter-current electrophoresis technique are not
suitable for field use. Other techniques rely on
monoclonal antibodies, enzyme-linked immunosorbent assay
(ELISA) systems, indium based slide precipitation and
DNA probes.



Ram and Nath 

Current Status of Chemotherapy of 
Leishmaniasis 

The chemotherapy of leishmaniasis depends on a small
number of synthetic drugs that are toxic not only to the
parasite but also to the host. In fact, the currently
available antileishmanial drugs were not designed and
synthesized for leishmaniasis; they were developed
primarily to treat other protozoal infections such as
malaria and trypanosomiasis. The lack of a concerted
effort to develop leishmanicidal agents has failed the
patients suffering from leishmaniasis, and it is
therefore imperative to take up the challenge and
discover compounds with high efficacy and minimal
toxicity. This review gives an overall view of the
development of the chemotherapy of leishmaniasis.



The Antimonials 

Although pentavalent antimonials have severe side
effects, they are still in use for the treatment of
human leishmaniasis. The important antimonial drugs
are tartar emetic (1) [15,16], urea stibamine (2) [17-21],
sodium stibogluconate (3) [22-24], meglumine
antimoniate (4) [25-27], stibamine (5) [28], stibacetin
(6) [29-31], stibophen and neostibosan or
ethylstibamine.



Tartar Emetic (Antimony Potassium Tartrate) 

This was the first effective antileishmanial drug,
used for treating mucocutaneous leishmaniasis [32] in
Brazil in 1912. In India this drug [33] also saved
countless lives from visceral leishmaniasis during the
epidemics that culminated at the end of the last century
and in 1918. It has also been used extensively against
Mediterranean Kala-azar caused by L. infantum and
L. tropica. In spite of its toxicity it was nicknamed a
"miracle drug" by the inhabitants of India, although it
was soon replaced by the better tolerated sodium antimony
tartrate.

[Structure 1: tartar emetic (dipotassium salt, trihydrate)]

Urea Stibamine

Early trials with this pentavalent antimonial gave
good results against Kala-azar, and it cured thousands
of cases. Between 1933 and 1936 the drug saved nearly
3.25 lakh (325,000) lives in Assam alone. Unfortunately,
doubts were raised regarding its chemical nature,
stability, toxicity and efficacy as an antileishmanial
agent. This drug is often used in combination with
Neostibosan.





[Structure 2: urea stibamine]

Sodium Stibogluconate (Pentostam) 

It is the safest and most potent antileishmanial agent
among the pentavalent antimonials. It was developed by
the Wellcome Foundation [22,23] and patented in 1925.
It has been used more widely than the others to treat
visceral leishmaniasis. The drug is better tolerated by
patients and can be given intravenously or
intramuscularly. The recommended adult dose is 20 mg/kg
given daily for 20 days. Mild muscular and joint pains
are noted as its side effects.
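
The weight-based regimen quoted above reduces to simple arithmetic, sketched below. Only the 20 mg/kg daily dose and the 20-day course come from the text; the 60 kg body weight and the function names are assumptions made for illustration.

```python
def daily_dose_mg(dose_mg_per_kg, body_weight_kg):
    """Daily dose in mg for a weight-based regimen."""
    return dose_mg_per_kg * body_weight_kg


def cumulative_dose_mg(dose_mg_per_kg, body_weight_kg, days):
    """Total dose in mg delivered over the whole course."""
    return daily_dose_mg(dose_mg_per_kg, body_weight_kg) * days


# Regimen quoted in the text: 20 mg/kg daily for 20 days.
# The 60 kg body weight is an assumed value, not taken from the review.
daily = daily_dose_mg(20, 60)
total = cumulative_dose_mg(20, 60, 20)
print(f"daily dose: {daily} mg, cumulative dose: {total} mg")
```

For the assumed 60 kg adult this gives 1200 mg per day and 24000 mg over the full course.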



[Structure 3: sodium stibogluconate]

Meglumine Antimoniate (Glucantime)

Another well-tolerated pentavalent antimonial,
glucantime, has been found effective against most forms
of leishmaniasis and is widely used in Latin America.
The adult dose is 17-28 mg/kg per day, given
intramuscularly or intravenously for 10-20 days. This
drug has also been used against canine Kala-azar in
France. The considerable side effects and toxicity of
these antimonials prompted the investigation of a variety
of other antimonials, such as stibamine, stibacetin,
stibophen and neostibosan, as leishmanicides. Of all
these, sodium stibogluconate (3) and glucantime (4) are
still in clinical use for the treatment of visceral
leishmaniasis.



Demerits of Antimonials

A drawback of these antimonials is that their efficacy
differs from batch to batch, depending on the amount of
Sb5+ present in the complex. The exact structures of
these antimonials are unknown, but inactivity in vivo
[34] is attributed to the presence of antimony in the
reduced Sb3+ oxidation state. Pentostam is believed to
act through the inhibition of certain enzymes, while the
mechanism of action of glucantime is uncertain. The side
effects of these antimonials include vomiting, diarrhoea,
abdominal pain, anorexia, malaise, weakness, muscle pain,
liver and renal damage, shock, itching, fever, dizziness,
heartburn and headache [35-38].



[Structures 5 and 6: stibamine (5: R = H) and stibacetin (6: R = COCH3)]

Aromatic Diamidines 

The search for an alternative to antimony chemotherapy
culminated in the discovery of some aromatic diamidines
as leishmanicides: stilbamidine (7) [39,40],
2-hydroxystilbamidine (8) [41,42], pentamidine (9)
[42,43], 4,4'-diamidinodiazoaminobenzene (10) [44] and
(4-amidinophenoxy)benzal(4-amidinophenyl)hydrazine
dihydrochloride (11) [45].

Initially, stilbamidine (7) and 2-hydroxystilbamidine
(8) were developed as trypanocides [39] but were also
tried in cases of Kala-azar [40]. However, in subsequent
trials they exhibited several toxic side effects, such as
low blood pressure and respiratory distress.

Further studies with aromatic diamidines led to the
discovery of pentamidine (9). It is a second-line drug
for the treatment of visceral leishmaniasis in patients
who fail to respond to antimony therapy [46,47].
Pentamidine [48] is less toxic and more effective than
stilbamidine against visceral leishmaniasis at a dose of
2-4 mg/kg for 15 days. The drug shows no neurotoxicity
but disturbs sugar metabolism and kidney function.
Frequent side effects are gastrointestinal disturbances,
vomiting, pain at the injection site, hypoglycemia and a
sudden fall in blood pressure (hypotension). It can also
give rise to cardiotoxicity, diabetes mellitus, shock and
hypocalcemia.

4,4'-Diamidinodiazoaminobenzene (10), known as
Berenil, is another drug that was synthesized primarily
as a trypanocide and showed antileishmanial activity
against L. donovani in hamsters [44]. Besides these
aromatic diamidines,
(4-amidinophenoxy)benzal(4-amidinophenyl)hydrazine
dihydrochloride (11) is another antiprotozoal compound
that has shown a high order of leishmanicidal activity
against L. donovani in the hamster model [45]. Its
further development as an antileishmanial agent was
hampered by its high toxicity, but it can serve as a good
standard in leishmanicidal screening.



Mode of Action of Aromatic Diamidines 

The aromatic diamidines interact with protozoan
kinetoplast DNA [49] and disorganise it. They were also
found to disrupt the mitochondria of both promastigotes
and amastigotes, followed by disruption and condensation
of the kinetoplast DNA core. The aromatic diamidines can
inhibit RNA polymerase [50] and other activities such as
ribosomal function [51] and the synthesis of nucleic
acids, proteins and phospholipids [52]. These drugs have
also been shown to block the synthesis of polyamines
[53], which the leishmania parasite needs for the
biosynthesis of DNA and protein.



Oxygen Heterocycles 



Amphotericin B (12) 

Amphotericin B (12) is a second-line drug for the
treatment of visceral leishmaniasis. It was also found
to inhibit cultures of L. donovani and L. braziliensis
[54]. It is an efficacious drug for the treatment of
mucocutaneous leishmaniasis in South America [55] and
produces an excellent response even in cases resistant to
the antimonials and diamidines [56-59]. According to
Murray et al., this drug exerts potent leishmanicidal
activity [60,61] at a dose of 5-10 mg given daily until a
dose level of 0.5-1.0 mg/kg is reached.

Amphotericin B is quite toxic and should therefore be
used only under strict medical supervision. It is
nephrotoxic, causing a fall in renal blood flow and
glomerular filtration rate, enhanced potassium excretion
and an increase in urine pH. The drug is also known to be
cardiotoxic and gives rise to nausea, vomiting, fever,
chills, anaemia and anorexia.



Nystatin (13) 

Another compound, nystatin (13), a fungicidal
antibiotic, is closely related to amphotericin B and has
been found useful in the treatment of Kala-azar [62] and
post-Kala-azar dermal leishmaniasis [63]. The toxicity of
the drug is nearly the same as that of amphotericin B.

[Structures 12 and 13: A-B = C=C in amphotericin B (12); A-B = CH-CH in nystatin (13)]

Rifampicin (14) 

Rifampicin, an antitubercular antibiotic, has also
been used orally in high doses for treating cutaneous
leishmaniasis [64-66].



[Structure 14: rifampicin]



Nifurtimox (15) 

Several 2-substituted-5-nitrofurans have been tested
for treating leishmaniasis, of which nifurtimox (15) was
found active in the treatment of cutaneous lesions and
mucocutaneous ulcers caused by L. braziliensis [67].




[Structure 15: nifurtimox]



Paromomycin (16)

It is an aminoglycoside antibiotic and is effective in
the treatment of Oriental sore [68].




[Structure 16: paromomycin]



Phaseolinone (17) 

It is a phytotoxin isolated from cultures of the plant
pathogen Macrophomina phaseolina and has been evaluated
as a new antileishmanial agent [69-72]. In a preliminary
trial, leishmania-infected hamsters treated with 20 mg/kg
body weight for 10 days showed a considerable decrease in
the level of parasitaemia [72,73]. Chronic administration
resulted in weight loss in weanling mice, spleen
enlargement and temporary anaemia. However, upon
withdrawal of the drug, conditions returned to normal
[72,73].




2-Furyl-1,3-dioxanes (18) 

Various 5-substituted-2-furyl-1,3-dioxanes (18)
have been reported as antileishmanial agents and are
found useful in controlling L. tropica [74].

[Structure 18: 5-substituted-2-furyl-1,3-dioxanes
R = NO2, Br, H, CH3
R1 = C2H5, CH2Cl
R2 = NO2, CH2OH, CH2Cl]

1,2,4-Trioxanes

Recently, several 1,2,4-trioxanes (19,20) evaluated
as antiparasitic agents have been reported to show a
high order of antileishmanial activity [75,76].




[Structures 19 and 20: 1,2,4-trioxanes
19: R1,R2 = (CH2)4, (CH2)2SCH2
20: R1, R2 = H, alkyl, aryl]



Mycophenolic Acid (21) 

The antileishmanial compound, mycophenolic acid 
(21), inhibits the conversion of inosine 
monophosphate to guanosine monophosphate and 
synthesis of GTP. It was found to eliminate 50% of L. 
major from human macrophages in vitro [77]. 




[Structure 21: mycophenolic acid]

Nitrogen Heterocycles 



Imidazole Derivatives

Various imidazole derivatives, known as fungicides
that act by inhibiting ergosterol synthesis, have shown
leishmanicidal activity in vitro. A well known
parasiticide, metronidazole (22), was found active in
treating oriental sore [78-80] and has shown moderate
activity against human cutaneous leishmaniasis [81].

[Structure 22: metronidazole]

Niridazole (23), another imidazole derivative, has
shown activity against L. tropica and L. braziliensis
infections [82,83].

2-Amino-5-(1-methyl-5-nitro-2-imidazolyl)-1,3,4-thiadiazole
(24) has been reported active against many
haemoflagellate infections in rodents [84].

[Structure 24: 2-amino-5-(1-methyl-5-nitro-2-imidazolyl)-1,3,4-thiadiazole]

Another imidazole derivative, ketoconazole (25), was
active in vitro at serum concentrations of 1.5-4.5 µg/ml
and eliminated all organisms at concentrations of
10-15 µg/ml [85]. This compound inhibits the
demethylation of ergosterol precursors in promastigotes
[86]. Ketoconazole at a dose of 400 mg per day for
6 weeks to 3 months has been reported to cure cutaneous
leishmaniasis [87,88].

Some other compounds, such as the
1-methyl-2-alkoxymethyl-imidazoles (26) synthesized by
Vanelle et al. [89], have shown a high order of efficacy
against leishmaniasis.

[Structure 26: 1-methyl-2-alkoxymethyl-imidazoles with side chain CH2OCH2(CH2)nCH3, n = 7-14]

A CDRI compound (Code No. 87/305),
1,2-dimethyl-3-methoxycarbonyl-4-(2-nitro-4,5-dimethoxyphenyl)pyrrole
(27), has shown a high order of efficacy [90,91] in vivo
after 7 and 28 days of treatment at a dose of 10 mg/kg
(i.p.).



[Structure 27: CDRI compound 87/305]



Acridines 

The antimalarial acridine drug quinacrine (28) has
been used in the treatment of oriental sore by the local
infiltration technique [92,93], but the response is
slower in long-standing cutaneous ulcers.



[Structure 28: quinacrine]

Several other 9-anilinoacridines containing 1'-NH-alkyl
substituents produced more than 80% growth inhibition of
L. major amastigotes in infected macrophages at a
concentration of 1 µM. 1-Hexylamino-9-anilinoacridine
was the least toxic compound and retained strong
antileishmanial activity, whereas 3,6-diamino
substitution of the acridine nucleus reduced or
eliminated leishmanicidal efficacy [94].

Pyrimidine Derivatives

Two pyrimidine derivatives, pyrimethamine (29) and
trimethoprim (30), are reported [95] to exhibit
leishmanicidal activity by inhibiting the dihydrofolate
reductase [96] of the parasite.

Using a pattern recognition approach, Ram et al.
have synthesized various classes of pyrimidines (31-33)
which have demonstrated a high order of leishmanicidal
efficacy against visceral leishmaniasis [97-101] in vivo
at different doses and also possess immunostimulant
activity comparable to that of MDP.



1,3,5-Triazines

Cycloguanil (34), a drug used in antimalarial
therapy, cured dermal lesions produced by
L. braziliensis [102,103]. Because of its fewer side
effects and low toxicity, it is a drug of choice for
treating oriental sore.

[Structure 34: cycloguanil]

Aminoquinoline Derivatives

A large number of 8-amino and 5-aminoquinolines
have been studied for leishmanicidal activity. The
standard antimalarial drug chloroquine (35) [104,105]
has been effective against cutaneous leishmaniasis, and
its 2-o-chlorostyryl derivative (36) [106] has shown
considerable activity against L. tropica infections.
6-Hydroxymethyl-2-N-isopropylaminomethyl-7-nitro-1,2,3,4-tetrahydroquinoline
(37), which was developed primarily as a schistosomicidal
agent, showed high efficacy against L. braziliensis
braziliensis in hamsters, with a marked regression of
parasitaemia [107].

The most effective member of this series is WR
6026 (38), an 8-aminoquinoline derivative with an
additional ring substitution and a longer side chain. It
has been found to be 400 times more active than
meglumine antimoniate against L. donovani in hamsters,
even when administered orally [108]. However, it is only
12-18 times more active than Pentostam against visceral
leishmaniasis in mice [109]. The compound displays high
efficacy against visceral leishmaniasis in rodents and
dogs but is less effective against cutaneous
leishmaniasis in experimental animals [110].

Thiosemicarbazones

Recently, thiosemicarbazones of
β-carboline-3-carboxaldehyde (39) and
3-acetyl-β-carboline (40) were found to inhibit the
growth of L. donovani in vitro at low doses [111]. A
naturally occurring β-carboline derivative, harmaline
(1-methyl-3,4-dihydro-7-methoxy-β-carboline) (41), has
also demonstrated significant leishmanicidal activity
against L. mexicana amazonensis both in vitro and in
vivo in mice [112].



[Structures 39-41: β-carboline thiosemicarbazones (39: R = H; 40: R = CH3) and harmaline (41)]



Alkaloids 

Berberine (42) and its derivatives were found useful
for treating cutaneous leishmaniasis [113-117]. The
amoebicidal alkaloid emetine (43) has moderate activity
[118,119], and dehydroemetine (44) showed better
activity against cutaneous leishmaniasis [120-122].




Pyrido[1,2-a]pyrimidines 

Various pyrido[1,2-a]pyrimidine derivatives (45a-c)
have recently been reported to show good antileishmanial
activity, inhibiting the growth of the leishmania
parasite in an in vitro model [123].




[Structure 45: pyrido[1,2-a]pyrimidines
(a): R = CH3; R5 = OH; R1 = R2 = R3 = R4 = H
(b): R = COOEt; R5 = OH; R1 = R2 = R3 = R4 = H
(c): R = R5 = CH3; R1 = Br; R2 = R3 = R4 = H]

Phenothiazine (46), Dibenzazepine (47) and
Dibenzocycloheptadiene (48)

These tricyclic compounds, which have antidepressant
and neuroleptic activities, are reported to show good
leishmanicidal efficacy, inhibiting the growth of
L. donovani promastigotes in an in vitro test [124].

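[Structures 46-48: phenothiazine (46), dibenzazepine (47) and dibenzocycloheptadiene (48), each bearing a substituent R1]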

Analogues of Purine 

Purine nucleotides such as ATP and GTP are
essential for the synthesis of nucleic acids and the
production of energy. The leishmania parasite is
incapable of de novo synthesis of purine nucleotides and
depends entirely on the host for purines. In leishmania,
purine nucleotides are derived primarily by
phosphorylation of nucleosides or bases such as adenine,
hypoxanthine, guanine and xanthine in the presence of
the enzyme purine nucleoside phosphotransferase. This
enzyme is either absent or poorly active in mammalian
cells, which is why certain inosine analogues are
phosphorylated by leishmania without any major effect on
mammalian cells. The enzyme may therefore be exploited
to modify purine nucleotide synthesis in the parasite
but not in the host. The resulting nucleotide analogues
either inhibit essential enzymes of purine metabolism or
are incorporated into the nucleic acids of leishmania as
false substrates, inhibiting growth of the parasite and
leading to its death.



Allopurinol and Related Compounds

Allopurinol (49) is an isostere of hypoxanthine and
is used as a uric acid-lowering drug. It is an inhibitor
of xanthine oxidase and has recently been found active
against promastigotes at a concentration of 10 mg/ml
[125]. Since allopurinol is less toxic than the
antimonials, it has been exploited clinically for the
treatment of Kala-azar [126]. It was found less effective
against leishmaniasis caused by L. braziliensis [127].
The poor activity of allopurinol is attributed to its
rapid metabolism into oxipurinol by xanthine oxidase.
Better results have been obtained in antimonial-resistant
patients by using allopurinol (20 mg/kg daily) in
combination with sodium stibogluconate [128,129].



[Structures 49 and 50: allopurinol (49) and allopurinol ribonucleoside (50)]

Allopurinol ribonucleoside (50) was shown to be
more effective than allopurinol in inhibiting the growth
of leishmania promastigotes in vitro [130]. Allopurinol
and its ribonucleoside were equally effective in
preventing the transformation of the intracellular form
(amastigote) of L. donovani to the extracellular form
(promastigote).

Allopurinol interferes with the metabolism of
adenine. Leishmania parasites convert allopurinol to
the corresponding ribonucleotide in the presence of the
enzyme hypoxanthine-guanine phosphoribosyltransferase.
This ribonucleotide arrests the synthesis of ATP, which
is essential for survival of the parasite, and is finally
incorporated into RNA, blocking multiplication of the
parasite [130,131].

Formycin B 

Formycin B (51), a close structural analogue of
inosine, was found to be leishmaniastatic for
promastigotes at a drug concentration of 1 nM [132]. It
was more active against amastigotes in human
macrophages. A study of macrophages infected with either
L. major or L. donovani revealed that about 90% of the
organisms were removed by formycin B at 0.05 µg/ml
[133]. In addition, this drug is toxic to the host
macrophages only at a concentration of 10 µM. Orally
administered formycin B is 7 times more active than the
intramuscularly injected drug and 4 times more active
than glucantime in hamsters infected with L. donovani
[134].






312 Current Medicinal Chemistry, 1996, Vol. 3, No. 5 






[Structure 51: formycin B]



The mechanism of action of formycin B follows purine
metabolism: promastigotes and amastigotes metabolize
formycin B into formycin A mono-, di- and triphosphates,
which are incorporated into ribonucleic acid [135].
However, the therapeutic index of formycin B in mammals
is not high enough to recommend it for human use [136].

Sinefungin (52) 

It is a nucleoside isolated from cultures of
Streptomyces incarnatus and S. griseolus. It has shown
significant leishmanicidal efficacy in vitro and better
activity than glucantime [137-140]. The mode of action
of sinefungin involves inhibition of the
carboxymethylation of proteins associated with membrane
transport [141].




Agents with Immunostimulatory and 
Leishmanicidal Activities 

Renaux and Renaux [147] made the important
observation that the anthelmintic drug levamisole (55)
possesses immunostimulant activity [148]. Its role in
certain autoimmune diseases and cancer was soon
established [149,150]. In addition, levamisole has been
found useful in treating the more serious cases of
visceral and mucocutaneous leishmaniasis [150].

The metabolite of levamisole in mammals,
1-0-mercaptoethyl-4-phenyl-2-imidazolidinone (56), has
demonstrated 4 times higher leishmanicidal activity
than levamisole itself by enhancing the phagocytic
activity of the reticuloendothelial system [151].
The 2-alkylthio-4-phenyl-2-imidazolines (57), which
retain the structural pattern of levamisole, showed
better efficacy than the model compound [151]. It was
therefore concluded that this structural pattern, rather
than the presence of a bicyclic system, is responsible
for the improved leishmanicidal activity.




The compound CP-46665-1 (58) is an
immunostimulating synthetic lipoidal amine containing a
piperidine ring and has been evaluated in L. donovani
infected mice. Used in combination with an
antileishmanial drug such as glucantime, it led to a
tenfold decrease in infection compared with untreated
mice [152].




[Structure 58: CP-46665-1]

Liposomes

Liposomes have been proposed as carriers for
targeting drugs and immunising substances in the
treatment of a great variety of diseases. Liposomes are
vesicles, generally less than 3 µm in diameter,
consisting of concentric rings of lipids separated by
aqueous phases. Since liposomes are mainly taken up in
the liver and spleen by the cells of the
reticuloendothelial system [153], they appear to be
suitable carriers for targeting drugs at protozoal
infections localised in the spleen and liver.

The efficacy of leishmanicidal drugs is considerably
increased if they are administered in liposomes. Tartar
emetic [154], sodium stibogluconate [155] and
glucantime [156-159] entrapped in liposomes have
shown increased levels of activity against L. donovani
infections in mice and hamsters.

New et al. [160] administered the antifungal agents
griseofulvin, 5-fluorocytosine and amphotericin B, as
well as pentamidine, entrapped in liposomes to mice
infected with L. donovani and observed that the
liposomised drugs display a high order of
antileishmanial activity.

Licochalcone A 

Licochalcone A (59), a novel antiparasitic agent, has
shown potent activity against human pathogenic
protozoan species of leishmania [161]. It is an
oxygenated chalcone isolated from the roots of the
Chinese licorice plant, and it inhibited the growth of
both promastigotes and amastigotes of L. major and
L. donovani at 5 µg/ml.

Bis(benzyl)polyamine Analogs 

The intraperitoneal administration of a
bis(benzyl)polyamine analog (60) suppressed both
pentavalent antimony (Sb5+) susceptible and resistant
L. donovani in vivo. Oral administration of (60) to
mice at 100 mg/kg twice per day for 14 days suppressed
99.7% of the parasites [162]. L. donovani infections in
BALB/c mice were also suppressed by 99% after
intraperitoneal dosing for 20 days with a total dose of
150 mg of (60) per kg of body weight. The efficacy of
this bis(benzyl)polyamine against L. donovani by both
parenteral and oral routes indicates that it can be
explored as a new antileishmanial drug.

Abstract: The design of novel therapeutic agents to treat specific diseases
has always been the aim of the medicinal chemist. Today, using the three- 
dimensional structure of the target-ligand complex as the main tool, it is 
possible to design novel therapeutic compounds with a specific mode of 
action. Knowledge of the molecular basis of the disease, the structure of the 
biological target (enzyme or receptor) and the mechanism of drug action are 
essential to succeed in this challenging task. 

In addition, the recent explosion of combinatorial chemistry techniques, along with the
development of high throughput and automated techniques that permit screening of thousands
of compounds per month against a given target, offers an unparalleled opportunity for medicinal
chemists to increase significantly the prospect of finding new bioactive molecules in a relatively
short period of time.




Introduction 

The ultimate goal of modern drug discovery is to
determine the molecular basis of a disease and to
design, using all available techniques, a therapeutic
entity that will correct the pathology.

Parallel advances in several disciplines, and their
proper integration in a multi-disciplinary approach,
have made possible the design and synthesis of new
therapeutics using knowledge of the structure and the
interactive forces that govern receptor-ligand
interaction at a molecular level.

Although the drug discovery process is a complex 
issue, it may be reduced to three "key" steps: a) 
development of biological systems for testing the 
compounds, b) identification of the "lead" 
compound(s), and c) optimization of the "lead" 
structure(s) to obtain the candidate drug(s), suitable for 
further in vivo studies and ultimate clinical evaluation. 

Drug design has been the subject of several books 
[1] and articles [2] from different perspectives. Because 
of the vastness of the topic, we will focus only on 
certain aspects of drug design, emphasising the 
modern "rational" approach of finding potential new 
"leads" and some of the new biological targets. 



Lead Discovery 

Until recently, "new" lead discovery depended 
largely on accidental observations, fortuitous findings 
(serendipity) and the screening of large numbers of 
compounds. 



The lack of adequate analytical tools and methods to 
determine the structure of relatively simple molecules 
hampered the scientists in rationalizing their findings 
and determining structural changes to be made to 
active compounds in order to improve their biological 
properties. 

Although Ehrlich is considered to have set the basis 
for medicinal chemistry at the beginning of this century 
[3], this branch of knowledge as we understand it today 
really began in the late 1960s, when scientists started 
correlating the structure of compounds to their 
activities. Undoubtedly, this was possible due to the rapid 
advances made by chemists, largely due to the 
introduction of new analytical tools for structural 
elucidation such as NMR and mass spectrometry. 

The introduction of linear multiple regression 
analysis by Hansch and Fujita [4] was the next major 
advance in rational drug design. This analytical model 
assisted scientists in reducing the initial number of 
modified structures to be investigated compared to that 
based only on "educated guesses", thus accelerating 
the iterative process of molecular modification and 
biological evaluation of active compounds. 
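The regression idea behind this kind of analysis can be sketched in a few lines. The example below is only an illustration: the descriptor values (log P and a Hammett-type sigma) and the activities are invented, and the purely linear model form activity = a*logP + b*sigma + c is a simplifying assumption, not the equation used in the cited work.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_hansch(log_p, sigma, activity):
    """Least-squares fit of activity = a*logP + b*sigma + c.

    Builds and solves the normal equations (X^T X) beta = X^T y.
    """
    rows = [[lp, s, 1.0] for lp, s in zip(log_p, sigma)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(3)]
    return solve_linear(xtx, xty)


# Hypothetical analogue series (invented values, noise-free for clarity).
log_p = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
sigma = [0.00, 0.23, -0.17, 0.54, 0.06, 0.78]
activity = [0.8 * lp + 1.2 * s + 2.0 for lp, s in zip(log_p, sigma)]

a, b, c = fit_hansch(log_p, sigma, activity)
```

With fitted coefficients in hand, the activity of an unsynthesized analogue can be estimated from its descriptors alone, which is how such a model narrows the list of structures worth preparing.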

Early discoveries of therapeutic agents posed an 
enigmatic challenge to the scientist of that time since 
the mode of action of the drugs was completely 
unknown. Later, advances in biology, biochemistry and 
other biological sciences formed the basis of 
understanding the mechanism of action of drugs. Now, 
it is known that most of the therapeutic agents exert 
their activity by inhibiting an enzyme or antagonizing 
(more likely) a receptor that is part of a metabolic 
process. 

In most cases, what is initially discovered is a "lead" 
compound, not the therapeutic end product. A "lead" 
is a prototype compound that has a certain desired 
pattern of biological activity but usually many other 
undesired characteristics such as high toxicity, low 
selectivity and poor pharmacokinetic characteristics. In 
the next step (lead optimization), the structure of the 
"lead" is synthetically modified in order to enhance the 
desired activity, and more particularly, to reduce or 
eliminate the unwanted biological properties. See 
Fig. (1). 

In this stage of the drug discovery process, 
pharmacokinetic and metabolic data are critically 
analyzed and specific structural changes made to 
optimize such biological characteristics and lower or 
eliminate the unwanted properties. In this case it is 
desirable to modify those parameters of the molecule 
that change the absorption, distribution and clearance 
of the drug. One of the most important parameters is 
the lipophilicity of the compound, which can be altered 
by varying the substitution in those parts of the 
molecule that are not relevant to the receptor/enzyme 
binding. 

Fig. (1). Structure-based drug design cycle. Lead compounds are either natural ligands or the result of random screening 
programs or known leads. Compounds with higher binding affinity are complexed with the biological target and crystallized for 
diffraction data collection. The 3D structures of target-ligand complexes are carefully analyzed to obtain information about the 
target-ligand interaction. Based on these results, improvements in the design can be made by optimizing the hydrogen bonds, 
van der Waals and electrostatic interactions. This process is iterated until a compound with satisfactory properties is obtained, 
and can be applied to improve existing leads or for de novo design. 

Another possibility is to replace a functional group in 
the molecule that is required for the affinity towards the 
receptor/enzyme with another that has a similar size, 
electronic distribution and molecular shape. This 
approach, called "bioisosteric replacement", has been 
extensively used in drug design to obtain better drugs 
[5]. Since bioisosteric groups may differ with respect to 
their lipophilicity, this replacement can yield 
compounds with improved pharmacokinetic and 
metabolic profiles [6]. 

Modern Strategies for Lead Discovery 

Depending on the most relevant feature of the 
whole process, lead discovery can be achieved 
primarily by "design" or by "screening". 

The approach chosen to get a new lead depends 
basically on the existing knowledge about the 
biological target. Usually, this depends on whether the 
expected drug has a known mechanism of action or a 
novel one. In the first case, the scientist has the 
enormous advantage of having a valuable body of 
knowledge that can be applied to accelerate the drug 
development process. 

In the second case, the approach consists of 
designing an inhibitor of an enzyme whose inhibition 
has not yet been established to produce the desired 
therapeutic effect. To achieve this challenging goal, 
the scientist should have a good understanding of the 
biological or biochemical mechanism of the disease. In 
this instance, the proper identification, isolation and 
characterization of the new biological target becomes 
an essential step in the entire process. 

Lead Discovery by Design 

Basically, there are four classes of "active-site 
directed" inhibitors: (a) Transition-state analogue 
inhibitors; (b) Mechanism-based enzyme inactivators; 
(c) Affinity labeling agents; and (d) Metabolically- 
activated and multienzyme-activated inhibitors. 

Although there are a large number of inhibitors of 
each class [7], the transition-state analog inhibitors are, 
perhaps, the most attractive target for design. 
Pioneered by Wolfenden [8] and Lienhard [9], the 
approach relies on the synthesis of a stable compound 
whose structure resembles that of the substrate at the 
hypothetical transition state of the reaction that the 
enzyme catalyzes. It is assumed that compounds 
having such a structure will bind more tightly than will 
the substrate in the ground state. In order to 
design an inhibitor by this approach, the medicinal 
chemist must understand the mechanism 
of the enzymatic reaction well enough to 
postulate a hypothetical structure of the substrate at 
the transition state. 

Quite recently, the synthesis of renin [10], HIV-1 
protease [11] and elastase inhibitors [12] has shown 
that it is possible to design a reversible protease 
inhibitor by applying a transition-state analog approach. 
Once the substrate specificity is known, a number of 
potent inhibitors can be synthesised by replacing the 
scissile bond of the protein by a nonhydrolyzable 
transition-state isostere. 

The success in the synthesis of a potent inhibitor of 
angiotensin-converting enzyme (ACE), a key 
enzyme in the blood pressure 
regulation mechanism, was one of the first examples 
where three-dimensional structural information 
played an important role in the design of a therapeutic 
agent. Cushman et al. [13], considering the analogy of 
this enzyme to bovine carboxypeptidase A, built a 
hypothetical model of the active site of ACE and 
used it to design a potent inhibitor. 

Moreover, the importance of X-ray structural studies 
in explaining the selectivity of the inhibitor-enzyme 
interaction was clearly established by Matthews et al. [14], 
who found that the difference in affinity of the antibiotic 
trimethoprim (TMP) 1 for Escherichia coli 
dihydrofolate reductase (DHFR) and chicken DHFR 
could be explained in terms of a difference of only 
1.5-2 Å between the active-site clefts of the two enzymes. 
Fig. (2). 




[Structure of trimethoprim (TMP) 1] 

Fig. (2). 

One of the modern procedures to discover new 
leads relies on designing molecules based mainly on 
the X-ray three-dimensional structure of biomolecules 
and on the application of computational strategies. The 
application of the structure-based drug design cycle 
has now become a standard strategy for the de novo 
design of novel leads or for improving existing leads 
[2a,2b,2c]. Fig. (1). 

New leads result from an iterative 
cycle in which the synthesis, biological testing and 
crystallographic analysis of target-ligand complexes 
allow the designer to gain progressive information 
until a satisfactory compound is produced for pre- 
clinical studies. 

Although the number of protein structures 
determined by X-ray crystallography is constantly 
increasing and likely to grow rapidly in the near future, 
there are many instances where the three-dimensional 
structures of important isolated biological targets have 
not yet been elucidated. New approaches have been 
developed to compensate for the absence of detailed 
protein structure information and to correlate drug 
structure to biological activities. One of these is the 
classical quantitative structure-activity relationship 
(QSAR). This method, pioneered by Hansch and Fujita 
[4], and Free and Wilson [15], continues to be an 
important tool in drug design since it is the only method 
that provides a quantitative description of transport, 
distribution and metabolism of drugs [16]. Also, recent 
advances in computer technology and software 
development have made possible the use of molecular 
graphics to obtain information about the ligand binding 
site of an unknown receptor. Receptor or 
pharmacophore mapping techniques [2g,17], and 
more recently 3D-QSAR [16,18] methods, are of great 
help in finding the bioactive conformation and, 
basically, in speeding up the analog design process. 

The value of computer modelling as a powerful tool 
to elucidate the three-dimensional structure of 
enzymes as well as to design specific inhibitors was 
convincingly proven in the synthesis of the inhibitor of 
the cercarial protease from the blood fluke 
Schistosoma mansoni, a serine protease that the 
schistosome parasite presumably uses to penetrate 
the epidermis and invade the human circulatory 
system [19]. Using the primary amino acid sequence of 
the enzyme, a three-dimensional model of the 
protease was built, taking advantage of the sequence 
similarity of the cercarial enzyme to the trypsin-like 
class of serine proteases. This procedure, termed 
"modelling by homology", has been successfully used 
when the percentage of sequence similarity between the 
known structure and the unknown one was greater 
than 40% [20,21,22]. 
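As a rough illustration of how such a threshold might be checked, the sketch below computes percent identity over two sequences that have already been aligned. Percent identity is used here as a simple stand-in for the "sequence similarity" mentioned above, and the two short aligned fragments are hypothetical, not real protease sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical pre-aligned fragments (one gap in the second sequence).
template = "IVGGYTCGAN"
target = "IVGGQECK-N"

identity = percent_identity(template, target)
suitable_for_homology_model = identity > 40.0  # threshold quoted in the text
```

A real workflow would first produce the alignment with a dedicated tool and would usually score conservative substitutions as well; this sketch covers only the final counting step.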

Lead Discovery by Screening 

The fact that 60-70% of the drugs currently in use 
owe their origin to natural sources highlights the 
importance that this highly serendipitous strategy for 
drug discovery has had and continues to have. 

Until recently, new medicinal chemical entities have 
resulted from "random screening" of natural products 
from plant extracts, marine organisms, invertebrates and 
microbial fermentation [23,24], and from screening of 
pharmaceutical company libraries of compounds. 

In "random screening" all compounds or mixtures 
are tested in "high-throughput" bioassays without 
regard to their structures. Although this strategy has 
been criticized as inefficient [25], most 
pharmaceutical companies still screen their 
accumulated stock of chemicals for activity against 
receptors or enzymes in order to identify novel 
pharmacophores. 

In "directed-screening", the compounds to be 
screened are previously selected on the basis of a 
mechanistic hypothesis or structural information about 
the substrate, ligand or inhibitor. 

Moreover, the advent of the radioligand binding 
assay and the access to new automated screening 
techniques (usually using a 96-well plate array) that 
allow thousands of compounds to be tested per week have 
revived this classic strategy to meet the current needs of 
the pharmaceutical industry. 

To solve inefficiencies inherent in the present drug 
discovery process, scientists have recently focused 
their attention on the concept of chemical diversity, 
both from a natural product perspective [24] and from 
recombinant phage and synthetically produced 
peptide libraries [26,27]. 

Recent application of this approach to drug 
discovery in the fields of oligonucleotides, 
carbohydrates and peptides is an example of how the 
chance of hitting a new bioactive compound relies more on 
the number of compounds tested than on their 
chemical structure per se. In this case, the approach 
consists of creating huge libraries of small organic 
molecules and devising novel assay formats for their 
efficient evaluation in a short timeframe. The structural 
diversity is generated through the assembly of sets of 
building block elements ("scaffolds") by using 
chemical, biological or biosynthetic methods. 

The possibility of generating synthetic combinatorial 
libraries composed of millions of low molecular weight 
chemical compounds and of identifying the molecular 
entities that bind to the target with the highest affinity 
turns this novel approach into an extremely attractive 
strategy for lead discovery. For example, using this 
concept, Houghten and co-workers [28] identified a 
potent opioid antagonist bearing no sequence 
homology to any known opioid peptide among 52 
million hexapeptides. Quite recently, Blondelle and 
co-workers [29] evaluated the antistaphylococcal 
activity of two synthetic peptide combinatorial libraries 
composed of 10 million tetrapeptides each and 
identified the individual active peptides by using an 
appropriate iterative selection process. 
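The scale of such libraries follows from simple counting: the number of distinct sequences is the product of the number of building blocks allowed at each position. The sketch below shows the arithmetic. The specific split of 20 choices at two positions and 19 at the remaining four is an assumption made here to show how a figure near 52 million can arise; it is not a detail given in the text.

```python
def library_size(choices_per_position):
    """Number of distinct sequences given the choices at each position."""
    size = 1
    for n in choices_per_position:
        size *= n
    return size


# All 20 proteinogenic amino acids at each of six positions.
full_hexapeptides = library_size([20] * 6)

# Assumed split: 20 choices at two positions, 19 at the other four.
restricted_hexapeptides = library_size([20, 20, 19, 19, 19, 19])
```

Because the count grows multiplicatively with length, testing every member individually quickly becomes impractical, which is why iterative selection over mixtures is used to home in on the active sequences.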

Even though the field of molecular diversity has not 
yet matured, the results so far obtained in broad 
discovery research as well as in the analoging of known 
leads are such that it can be assumed that this 
approach will provide, in the near future, an effective 
source of new molecular entities for drug discovery. 

Another potential source of lead structures comes 
from drug metabolism studies [2d,30]. This is a 
particularly attractive source of unforeseeable "leads" 
when the putative active compound is directly tested 
on intact animals [31]. The classical example of this 
approach was the discovery of the first antibacterial 
agent sulfonamide 2, which was found to be a 
metabolite of prontosil 3 [32]. Similarly, the 
anti-inflammatory drug sulindac 4 is not the bioactive 
compound, but the metabolic reduction product 5 is [33]. 
Fig. (3). 

Fig. (3). 

New Targets for Drug Design 

Diseases or symptoms of diseases arise as a 
consequence of an imbalance of particular chemicals in 
the body, from invasion by foreign microorganisms or 
from aberrant cell growth. 

To understand the mechanism of diseases at a 
molecular level is the most logical way to obtain clues to 
developing new therapeutic agents. 

The explosion of knowledge in molecular biology in 
the last twenty years, fueled by developments in 
recombinant DNA technology, has placed this new 
branch of science in such a predominant position that 
almost all biological processes can be studied from a 
totally new perspective. 

The importance of growth factors, intracellular 
receptors, signal transducers and activators of 
transcription in the proliferation and differentiation of 
mammalian cells is now being extensively studied 
[34,35,36]. Indeed, any of the essential elements in 
the signal transduction pathway are obvious targets for 
intervention in cancer. Recent success in using 
tyrosine kinase blockers and other signal interceptors 
such as protein kinase C blockers, Ras blockers and Ca2+ 
signaling inhibitors to inhibit the growth of cancer cells 
in vivo and in vitro makes the so-called signal- 
transduction therapy a novel approach in the treatment 
not only of cancer but of other proliferative diseases [37]. 



Twenty years ago, while scientists were dealing with 
the mechanism of gene expression in antibiotic-producing 
microorganisms, genetics already displayed 
its potential value in the infectious diseases area by 
introducing genetically engineered Streptomyces in 
the production of hybrid antibiotics [38,39]. Today, 
from an almost opposite perspective, the identification 
of compounds that can inhibit gene expression in 
Pseudomonas aeruginosa (40) and the recently 
discovered antifungal Azoxybacilin 6 (Fig. (3)) as a 
gene expression inhibitor of sulfite reductase (41) 
show that many of the factors discovered in the signal 
transduction pathway are interesting targets for 
developing new anti-infective drugs. 

The predominant role of DNA in cellular replication 
and transmission of genetic information makes the 
nucleic acid a primary target for drug action. The use of 
synthetic antisense oligonucleotides (ODNs) for 
therapeutic purposes was first proposed by Zamecnik 
and Stephenson almost twenty years ago (42,43). The 
antisense strategy for treatment of diseases relies on 
the fact that gene expression can be inhibited by 
the sequence-specific binding of an ODN to an RNA 
or DNA target (44,45,46). This binding is governed by 
Watson-Crick base pairing and, in theory, an ODN can 
be designed to target any gene of known sequence, 
potentially creating specific therapeutics for any disease 
in which the causative gene has been identified (44). 
Although the use of ODNs as therapeutic agents has 
proven to be difficult, the fact that modern nucleotide 
chemistry has recently enabled the synthesis of 
chemically modified oligonucleotides with better 
pharmacokinetic properties has turned this approach into 
a very promising potential source of new therapeutic 
agents. 
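The design rule implied by Watson-Crick pairing can be written down directly: an antisense ODN is the reverse complement of its target. The following minimal sketch assumes an unmodified DNA oligonucleotide directed at an RNA target; the nine-base target sequence is hypothetical and chosen only to show the mapping.

```python
# RNA target base -> pairing DNA base in the antisense strand.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_odn(target_rna):
    """Return the DNA antisense sequence (5'->3') for an RNA target (5'->3')."""
    # Strands pair antiparallel, so read the target backwards while
    # substituting each base with its Watson-Crick partner.
    return "".join(COMPLEMENT[base] for base in reversed(target_rna.upper()))


# Hypothetical target stretch, written 5'->3'.
target = "AUGGCCAAG"
odn = antisense_odn(target)
```

In practice the choice of target site, the oligonucleotide length and the backbone chemistry all matter; the sketch addresses only the base-pairing step that makes the approach sequence-specific.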

Moreover, the tremendous accomplishment behind 
the first reported genome map of the bacterium 
Haemophilus influenzae opens new avenues to find 
answers to fundamental questions about microbes 
(47). For example, once the sequencing of the most 
relevant microbial genomes is completed, by comparing 
the genomes of virulent and harmless strains of bacteria, 
it will be feasible to identify the disease-causing gene, 
hence allowing one to identify the enzyme or protein 
closely related to the disease (48) and to target the 
so-called "virulence factors". 

The recent interest in carbohydrate-containing 
molecules in therapeutics is due to the fact that 
researchers realized that this type of molecule is 
implicated in many disease states, including auto- 
immune diseases such as rheumatoid arthritis, 
infectious diseases, inflammatory processes, peptic 
ulcers and cancer. This area has mushroomed in recent 
years, particularly in defining the role of surface 
carbohydrates in microbial and cell adhesion (49) and in 
the identification of enzymes involved in glycoprotein 
synthesis as novel targets for new therapeutics. 

Since the main carbohydrate ligand involved in 
protein-carbohydrate recognition is known as sialyl 
Lewis X, several companies are involved in using this 
knowledge to produce a non-carbohydrate molecule 
that can mimic this compound, as a lead to new drugs. 

The first rational design of sialic acid analogues 
based on the crystal structure of influenza virus 
sialidase that inhibit not only the enzyme but also virus 
replication in cell culture and in animal models (50) 
highlights the significant progress made in this area and 
the relevance of carbohydrates as a potential source of 
valuable new therapeutics. 

Conclusion 

Since many important common diseases are only 
ameliorated by current therapy, without affecting the 
course of the disease, the need for better and safer 
drugs remains critical. 

Despite the significant progress that has been made 
towards a rational basis of drug design, the successful 
development of a new pharmaceutical agent is still a 
long and arduous process. The recent application of 
combinatorial chemistry and new screening 
methodologies in drug discovery represents an 
extremely important advance made in medicinal 
chemistry. 

Although there is disagreement about which 
strategy will be more important in deciding the next 
generation of pharmaceuticals, the drug discovery 
process will certainly benefit from a harmonious 
implementation of both approaches. 

Many new targets in disease processes that offer 
the possibility of finding novel and effective drugs are 
rapidly being identified. This is, in a large measure, due 
to rapid advances in disciplines such as molecular 
biology, cell biology and genetics. With a better 
understanding at a molecular level of biological 
processes (including disease processes) a more logical 
approach to drug discovery can be applied. 

The present emphasis is on finding novel and more 
effective drugs with new mechanisms of action which will 
treat the disease processes rather than their 
symptoms. 

Abstract: Retinoic acid (RA) is a metabolite of retinol (vitamin A) which is 
necessary for a number of biological functions including the maintenance of 
growth and epithelial cell differentiation. It appears that most of these 
functions are mediated through formation of a complex with nuclear retinoid 
receptors which act in a steroid hormone-like mechanism. RA and its analogs 
(retinoids) have found use in treating dermatological diseases and show promise 
as cancer chemopreventive/chemotherapeutic agents because of this impact on epithelial 
tissue differentiation. RA is extensively metabolized and this has a profound impact on its actions 
and activity. Thus, a reasonably detailed discussion of the quantitatively and/or biologically 
important metabolites of RA is provided as well as molecular design efforts to inhibit metabolic 
inactivation processes and stabilize metabolic activation products. The review is concluded with 
a description of current efforts in the development of more heavily structurally modified analogs, 
particularly those with nuclear retinoid receptor or receptor subtype specificity. 




Introduction 

Retinoic acid (1) is a polyene carboxylic acid derived 
from metabolic oxidation of vitamin A (retinol). 
Systematically, this yellow crystalline material is named 
3,7-dimethyl-9-(2,6,6-trimethyl-1-cyclohexen-1-yl)- 
2,4,6,8-nonatetraenoic acid. The more common 
numbering scheme for this molecule is shown below. 
The resemblance of this compound to a long-chain 
polyunsaturated fatty acid results in its limited aqueous 
solubility, while the presence of the conjugated network 
of double bonds confers a long-wavelength UV 
absorption maximum at about 350 nm and a high molar 
extinction coefficient (ε ~ 40,000) when dissolved in 
most organic solvents. Also, because of this 
conjugated polyene component of its structure, 
retinoic acid is relatively susceptible to acid-catalyzed, 
thermal, and light-induced isomerization which can 
have profound effects on its biological activity and site 
of action. 
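The practical value of a high extinction coefficient is that dilute solutions can be quantified by UV absorbance through the Beer-Lambert law, A = ε·c·l. The sketch below assumes that the quoted value of about 40,000 is in the usual units of M^-1 cm^-1 and that a 1 cm cuvette is used; the absorbance reading is a hypothetical number.

```python
def concentration_from_absorbance(absorbance, epsilon=40_000.0, path_cm=1.0):
    """Beer-Lambert law solved for concentration (mol/L): c = A / (epsilon * l).

    epsilon is assumed to be in M^-1 cm^-1 and path_cm in cm.
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


# Hypothetical reading near the absorption maximum (about 350 nm).
conc_molar = concentration_from_absorbance(0.8)
conc_micromolar = conc_molar * 1e6
```

Because the isomers differ in their absorption properties and interconvert under light, such a measurement is meaningful only for a sample of known isomeric composition that has been protected from light.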




1 R=COOH 

2 R=CHO 

3 R=CH2OH 

Physiologically, retinoic acid (RA) is produced via 
irreversible oxidation of retinal (2), the visual pigment 
chromophore which is itself biosynthesized by 
reversible oxidation of retinol (3) [1]. With the 
exception of its visual function as well as in supporting 
some aspects of reproduction in mammals, it is 
generally believed that RA is the active form of retinol in 
controlling most of the other functions of vitamin A. It 
now appears that many of these functions result from 
regulation of transcription by non-covalent binding of 
RA to a closely related family of nuclear retinoic acid 
receptors (RARs) [2-3]. 

The main sources of vitamin A are dietary and are 
derived from provitamin A carotenoids from vegetables 
or retinyl esters from animal sources or dietary 
supplements. Symmetric, central cleavage of 
carotenoids such as β-carotene by a β-carotenoid- 
15,15'-dioxygenase yields two molecules of retinal [4]. 
It has been shown that a cellular retinol-binding protein 
type II (CRBP-II) binds this retinal and permits its ready 
reduction to retinol by a mucosal retinal reductase [5]. 
On the other hand, retinyl esters, which are often used 
as dietary supplements, are hydrolyzed in the intestinal 
lumen prior to absorption. Regardless of the source of 
retinol appearing in the enterocytes, the vitamin is 
reesterified to form primarily retinyl palmitate and 
stearate. These resulting hydrophobic esters are 
subsequently packaged in chylomicrons and secreted 
into the lymphatic system. While in this system, the 
chylomicrons are converted to chylomicron remnants 
containing the retinyl esters, and the primary mode of 
removal of these remnants from the circulation is via 
uptake by specialized liver cells [6,7]. 

In the liver, two cell types play important roles in 
retinoid storage and metabolism. Initially, retinyl esters 
are taken up by the parenchymal cells (hepatocytes) 
where a retinyl ester hydrolase cleaves the esters to 
free retinol. The resulting retinol is bound to the retinol- 
binding protein (RBP) which serves as the carrier 
protein for the transport of the insoluble ligand in 
plasma. The 1:1 RBP-retinol complex is secreted from 
the liver into the plasma when the vitamin A status of 
the individual requires it [8]. This complex circulates in 
the plasma bound to a second protein, the thyroid 
hormone binding protein transthyretin, formerly known 
as prealbumin. 

When vitamin A status is sufficient, most of the 
retinol is transferred to the second specialized cell, the 
stellate cells, perhaps while bound to RBP. Here, as 
well as to some extent in the parenchymal cells, retinol 
is reesterified as primarily palmitate and stearate esters. 
As is the case for CRBP-II bound retinol in the intestine, 
this reesterification of retinol in the liver uses CRBP- 
bound retinol as substrate [9]. 

Retinol bound to RBP in the plasma is transported 
into many peripheral tissues, particularly those that are 
vitamin A-dependent target tissues. The mechanism of 
retinol uptake is unclear. However, only in the case of 
the retinal pigmented epithelium of the eye has any 
substantial evidence for an RBP receptor been 
gathered [10]. Research in other tissues has led to the 
suggestion that uptake of retinol into cells is driven by 
the intracellular apo-CRBP concentration [11] although 
more research is needed to clarify the mechanism(s) 
involved in uptake of retinol by target tissues. 

While RA appears to be the important metabolite of 
retinol in maintaining most vitamin A functions, its 
biosynthesis in, and delivery to, target tissues remain 
relatively poorly understood. While a trace of RA may 
be derived from dietary sources, it is not clear whether 
this is sufficient to account for the serum RA which 
circulates bound to serum albumin at about 0.5% of the 
concentration of plasma retinol [12]. In tissues, it 
is likely that RA is produced by oxidation of retinol 
although which enzyme system(s) is involved remains 
unclear. In some experiments, forms of aldehyde 
dehydrogenase have been shown capable of oxidizing 
retinol to RA [13]. Doubts about whether this relatively 
non-specific family of enzymes can regulate the small 
quantities of RA required led to the discovery that RA 
can be produced in porcine kidney cells in the 
presence of inhibitors of alcohol and aldehyde 
metabolism, suggesting that a more specific machinery for 
oxidation exists [14]. The details of this 
transformation(s) need further study, but it no doubt 
utilizes a variety of enzymes which may differ from 
tissue to tissue. 

It has been demonstrated that in a number of 
tissues, β-carotene may be converted directly to RA 
without the presence of any detectable retinal 
intermediate [15]. While the mechanistic details of this 
putative pathway to RA remain to be elucidated, the 
fact that human tissues can accumulate carotenoids 
raises the possibility that this mechanism may 
contribute in part to the concentration of intracellular 
RA. 

Retinoic acid has long been known to have a 
relatively short biological half-life. One of the primary 
reasons for this is the ready susceptibility of the parent 
molecule to undergo metabolism to products which, for 
the most part, appear to be catabolites with reduced 
biological activity. Thus, some efforts in medicinal 
chemistry in this area have focussed on using 
molecular design to inhibit the conversion of RA to 
inactive metabolites or to probe the role of other 
metabolites in the actions of RA. Simultaneously, the 
recognition that many or all of the actions of RA and its 
analogues (retinoids) are mediated by association of 
the ligand with a family of nuclear receptors of the 
steroid-thyroid hormone superfamily, has resulted in a 
substantial increase in drug design activity aimed at 
developing receptor-selective retinoids. These two, 
occasionally overlapping research thrusts will be the 
primary focus of this review. 

Retinoid Metabolite Studies 

Early studies of urinary and fecal metabolites 
resulting from supraphysiological doses of RA 
described many structures which resulted from both 
extensive changes in the polyene side chain as well as 
oxidations of the trimethylcyclohexenyl ring [16,17]. 
Few of these metabolites showed any useful biological 
activity and efforts shifted to study of the more 
prominent metabolites found at physiological 
concentrations. The vast majority of these metabolites 
are the result of one or more of three chemical 
conversions: 1) double bond isomerization, 2) 
oxidation, or 3) conjugation. 

As mentioned earlier, retinoid isomerizatiQn is a 
facile process chemically and physically; this is also the 
case metabolically. It has long been known that only 
certain of the possible geometric isomers of retinoids 
are thermodynamically favored [18] and that equilibrium 
mixtures of isomers can be produced photochemically 
[19]. For example, we have found that UV-induced 
photoisomerization of RA in physiologic-like solutions 
produced seven isomers we could identify at the 
photostationary state with the relative distribution 
shown in Table 1 [20]. While the relatively less favored 
11-cis isomer has no known biological activity,
isomerization of the 11,12-double bond of retinal (2)
provides the important visual pigment chromophore
11-cis-retinal. The visual photocycle has long been
known to involve isomerization of 11-cis-retinal to the
all-trans aldehyde [21]. Formation of 11-cis-retinal is
thought to occur by oxidation of 11-cis-retinol
produced in the eye via an isomerohydrolase which
both hydrolyzes and isomerizes all-trans retinyl esters
[22]. Probing the role of retinal isomerization in the
visual cycle has been a fruitful area for bioorganic
chemistry and molecular design. For example,
restricting rotation about the side chain bonds by
incorporating them into aliphatic rings in compounds
such as 4, has been especially exploited for Δ11,12
double bond restriction in studies of rhodopsin [23].

Table 1. Retinoate Isomer Ratio at Photostationary State (a)

Retinoic Acid Isomer    Relative Percentage
9,11,13-tri-cis          6.3
11,13-di-cis            18.4
13-cis (5)              27.0
9,13-di-cis             10.3
11-cis                  11.3
9-cis (10)              10.1
all-trans (1)           16.5

(a) From reference 20.
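
As a simple numerical check on Table 1, the reported percentages can be summed and ranked. The Python sketch below is illustrative only: the dictionary transcribes the values of Table 1 (isomer names as read from the table), and the helper name `summarize` is a hypothetical choice, not something taken from reference 20.

```python
# Relative percentages of retinoic acid isomers at the photostationary
# state, transcribed from Table 1 (values attributed to reference 20).
ISOMER_PERCENT = {
    "9,11,13-tri-cis": 6.3,
    "11,13-di-cis": 18.4,
    "13-cis": 27.0,
    "9,13-di-cis": 10.3,
    "11-cis": 11.3,
    "9-cis": 10.1,
    "all-trans": 16.5,
}


def summarize(distribution):
    """Return (total, isomers ranked by abundance, combined share of cis isomers)."""
    total = round(sum(distribution.values()), 1)
    # Sort isomer names from most to least abundant.
    ranked = sorted(distribution, key=distribution.get, reverse=True)
    # Everything that is not the all-trans isomer contains at least one cis bond.
    cis_share = round(total - distribution["all-trans"], 1)
    return total, ranked, cis_share


total, ranked, cis_share = summarize(ISOMER_PERCENT)
print(f"sum of reported percentages: {total}")
print("most abundant isomer:", ranked[0])
print(f"combined share of cis isomers: {cis_share}")
```

The reported values sum to slightly less than 100, presumably because of rounding, and the 13-cis isomer rather than all-trans-RA is the most abundant species at the photostationary state.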




The isolation of 13-cis-RA (5) from tissue extracts of
rats given all-trans-RA was first reported in 1967 [24]. It
was later found that the isolation procedures employed
had caused extensive isomerization of RA to 13-cis-RA,
casting doubt on its importance as a natural metabolite.
The presence of 13-cis-RA in rats was finally confirmed
through the use of high performance liquid
chromatography analysis [25] and the 13-cis-RA
concentration was subsequently assessed in human
serum [26]. No isomerase which catalyzes the all-trans
to 13-cis conversion has been identified although it has
been shown that physiological thiols such as
glutathione can catalyze the conversion in vitro [27].
Nonetheless, it is now generally believed that 13-cis-RA,
which is also called isotretinoin, is a naturally
occurring metabolite equivalent in biological activity to
RA both in vivo [28] and in vitro [29]. Clinically,
isotretinoin is currently used for the treatment of severe
cystic acne under the trade name Accutane®.

Despite the fact that no specific isomerase has been 
found for this all-trans/13-cis interconversion, because
of the facile chemical and physical interconversion of
these isomers it remained possible that the observed
bioactivity of 5 was due to its isomerization in vivo to the
all-trans isomer 1. Thus, we prepared a series of
cyclopropyl analogs of 1 and 5 (6-9) designed to block
this reversible isomerization and probe the role of
13-cis-RA in the actions of RA [30]. These analogs were
chosen for synthesis to minimize the introduction of
significant steric encumbrance at the carboxyl terminus
and, in addition, because the sp2-like bonding character
of the cyclopropane ring resembles that of the olefinic 
bond in many respects [31]. While the methyl analogs 
6 and 7 were incapable of reversing cornification of the 
vaginal epithelium in vitamin A-deficient ovariectomized 
rats, the desmethyl analogs 8 and 9 were equivalent in 
activity, if somewhat less active than ethyl retinoate 
[32]. Since model building showed that rotation of the 
cyclopropane ring of 8 and 9 into the plane of the 
polyene permits them to closely approximate the 
structure of 1 and 5 found by x-ray diffraction, their 
equivalent activity supported the conclusion that 1 and 
5 may be equiactive as distinct isomers. This 
observation may now be more interesting in light of the 
failure of researchers to demonstrate significant affinity 
of 13-cis-RA for the RARs or more recently discovered
retinoid X receptors (RXRs). 

More recent work has revealed a further important 
isomeric metabolite of RA. That is, 9-cis-RA (10) has
been identified as a natural and potentially important
isomeric retinoid. The possibility that 9-cis-RA functions
as a bioactive retinoid metabolite was first reported by
Levin et al. [33] and Heyman and coworkers [34].
These authors also demonstrated that 10 is a ligand for
the RXRα. The RXR receptors are a relatively more
recently discovered receptor family closely related to
the nuclear RAR receptors [2,3]. Further work has
shown 9-cis-RA to be 40 times more effective at
activating the RXR nuclear receptor family (which has
three subtypes labelled α, β, and γ as do the RARs)
than any other natural retinoid known, including RA
[35]. Unlike the RARs, which bind to nuclear
transcription sites perhaps exclusively as RAR/RXR
heterodimers, the RXRs bind both as heterodimers
with the RARs as well as RXR homodimers and as
heterodimers with other nuclear hormone receptors
such as the vitamin D and thyroid hormone receptors.
This discovery that 10 binds to the RXRs has
encouraged more research into the identification of
bioactive natural retinoids. While the metabolic pathway
for the generation of the isomeric 9-cis-RA also is not
known, recent synthetic efforts toward the
stereoselective synthesis of 9-cis-RA, such as those of
Boehm et al. [36], have been successful in providing
9-cis-RA for further biological studies.

7  R1=CH3, R2=H, R3=CO2Et
8  R1=H, R2=CO2Et, R3=H
9  R1=H, R2=H, R3=CO2Et

Because of the apparently pivotal role of the 
thermodynamically less preferred 9-cis isomer of 1,
efforts have recently been directed toward the
preparation of conformationally constrained analogs of
10. The cyclohexenyl 9-cis-RA analog 11 has been
described and has been found to be a specific RXR
ligand and activator of RXR-mediated pathways [37].
Using more heavily modified bicyclic endgroups based
on research on more fully synthetic
retinoids (see below), the cyclopropane 12 [38] and




11 



the more highly substituted 13 and 14 [39] have been 
synthesized and 12 and 13 have been found to be 
very RXR-selective ligands and activators of RXR- 
specific pathways. Using our simple cyclopropane 
isostere approach, we have made a similar study 
possible through preparation of the minimally modified 
natural product-like cyclopropanes 15 (of limited 
stability due to [3,3]-sigmatropic shift) and 16 (Wong, 
Repa, Clagett-Dame, and Curley, unpublished results).
Results of receptor binding studies obtained to date
are consistent with those reported for 12-14. That
these conformationally constrained 9-cis-RA analogs
show such pronounced RXR selectivity given the "pan
agonist" actions of 9-cis-RA with regard to binding to
RARs and RXRs is intriguing and deserves further 
investigation. There may be an even more interesting 
further development in the potential utility of these 
types of compounds as receptor probes given the 
recent results of structural studies of the RAR retinoid 
binding site which suggest that the binding sites for 1 
and 10 on the RARs are overlapping but not 
coincident [40,41]. 

While the active metabolite RA is itself produced by 
two step oxidation of retinol, further oxidation of RA 
often results in inactivation and excretion of the more 




12  R1=H, R2=H
13  R1=CH3, R2=H
14  R1=CH3, R2=CH3

15

16

polar metabolite. In 1978, McCormick et al. isolated
5,6-epoxyretinoic acid (17) from the intestinal mucosa of
vitamin A-deficient rats given [3H]-retinoic acid [42].
Epoxide 17 was also present in significant
concentrations in the liver but not in the kidneys of
vitamin A-deficient rats [43]. Vitamin A-deficient rats
given physiologic doses of [3H]-retinol also
synthesized 17 in the kidneys [44]. While it was 
concluded that 17 is an endogenous metabolite of 
retinol in the kidney, contrary to earlier views [45], it is 
now believed that this epoxidation is not necessary for 
RA action. When in vivo epoxidation in vitamin A- 
deficient rats is inhibited by N,N'-diphenyl-p-phenylenediamine,
a free radical scavenger, RA is still
able to promote differentiation of the vaginal epithelium 
in the vitamin A-deficient ovariectomized rat [46]. 
Furthermore, 17 was shown conclusively to be only 
0.5% as active as RA in promoting growth in vitamin A- 
deficient rats [47]. There have been relatively limited 
tests of the role of this epoxidation made via structural 
modifications of 1 designed to prevent formation of 
17. In one instance, the 5,6-cyclopropyl analog 18 was 
prepared and found to have only moderate activity in 
inhibiting the phorbol ester-induced induction of 
ornithine decarboxylase in mouse skin, although it was 
not compared with 17 in this skin cancer 
chemoprevention assay [48]. 

Biotransformation of RA to 4-hydroxy-RA (19) and
subsequently to 4-oxo-RA (20) has been
demonstrated both in vitro and in vivo [49-51]. Roberts
and co-workers studied the formation of 19 and 20
from RA in hamster liver microsomes [52]. The first step
involves oxidation of RA at the C-4 position of the
trimethylcyclohexenyl ring by an enzyme with
properties of a P450 monooxygenase, in that it required
NADPH and molecular oxygen, and was inhibited by
carbon monoxide. This has been confirmed by many
other studies [53-55]. The second step, oxidation of 
19 to 20, involves oxidation by an enzyme whose 




17  X=O
18  X=CH2

19  R=OH
20  R= =O



properties are consistent with a dehydrogenase 
because it required NAD + but did not require oxygen. 
Further oxidation of 4-oxo-RAs to more polar 
metabolites also seems to be dependent on the 
cytochrome P450 system [13,52,56].

The biological activity of 19 and 20 is one tenth that 
of RA in causing epithelial differentiation [51] and in 
promoting growth in rats [57]. Consequently, it is 
generally felt that oxidation of the C-4 position of RA is 
an early step in a series of metabolic transformations 
that lead to the deactivation and excretion of RA from 
the body. While 20 seems to have no "useful" 
biological function, it still retains its teratogenic effects 
[58] and has shown some positive actions in 
developmental biology experiments [59]. 

Because of the secondary allylic nature of the 4- 
position of RA, synthesis of metabolites 19 and 20 can 
also be accomplished straightforwardly by allylic 
oxidation with MnO2 [60] or allylic bromination followed
by solvolysis and oxidation [61]. However, most 
synthetic efforts have been directed toward preparing 
analogs to prevent this oxidative inactivation of RA. 
Replacement of protons with halogen atoms,
particularly fluorine, is an often used strategy to
minimize metabolic hydroxylation of the carbon to
which the protons are bound. To this end, Barua and
Olson [62] prepared 4,4-difluororetinoic acid (21).
Unfortunately, perhaps because of the allylic nature of 
the fluorine atoms in 21, these atoms showed some 
slight chemical lability implying potential for metabolic 
instability and perhaps contributing to the limited 
biological activity of 21 [63]. Other 4-position 
substitutions such as the 4,4-dimethyl (22) and 4,4- 
cyclopropyl (23) analogs have, however, been 
reported to have at least moderate-to-good biological 
activity [48,64]. 

An interesting indication that the inhibition of 
metabolic inactivation of RA may be useful has been 




21  R=F
22  R=CH3

28  R= =NOCH2COOH



obtained. That is, antifungal-like substituted imidazoles 
such as liarozole have been shown to have an in vivo
retinoic acid-mimetic effect specifically by inhibiting
the cytochrome P450 monooxygenase which 
hydroxylates the 4-position of RA [65]. Nonetheless, 
while oxidation of RA to 20 is generally thought to 
result in deactivation, Shealy and co-workers have 
recently shown that 3,3-dialkyl analogues of 20, such 
as 24, can show good retinoidal activity in cancer 
chemoprevention assays [66]. Clagett-Dame et al. have 
also recently demonstrated that both 19 and 20 show 
good-to-excellent affinity for the RARs [67]. 

To add further to the slight uncertainty about 
whether 4-oxygenated retinoids can have useful 
activity, we have found that 4-methoxyretinoic acid (25) 
and 4-acryloyloxyretinoic acid (26) show some skin cancer
chemopreventive potential [68]. In addition, we have 
also demonstrated that synthetic 4-(2- 
hydroxyethoxy)retinoic acid (27) can be coupled to 
epoxy-activated Sepharose 6B and used in an 8500- 
fold purification of rat testicular CRABP by affinity 
chromatography [69]. Zile and co-workers have also 
prepared an oxime of 20 (28) and found it to be very 
useful for the generation of anti-RA antibodies when 
bound to chicken IgG [70]. 

If oxidative metabolism occurring to the 
trimethylcyclohexenyl ring of RA results primarily in 
inactivation, in hindsight a fairly obvious modification 



would be to replace the ring with aromatic, substituted, 
or heteroaromatic rings. One of the first successful 
materials of this type prepared was the important 4- 
methoxy-2,3,6-trimethylphenyl analog acitretin (29) 
which is clinically employed in dermatology [71] as its 
ethyl ester etretinate (30). Interestingly, no details are 
generally provided concerning the original rationale for 
the preparation of 30 [72] and it may be that this 
effective retinoid was discovered somewhat more 
serendipitously. For example, in connection with some 
other research, we have found by chance (Sundaram 
and Curley, unpublished results) that the plant growth- 
promoter abscisic acid (31) rearranges to the
etretinate-like compound 32 upon treatment with 
sulfuric acid in methanol. Nonetheless, 30 is a clinically 
useful retinoid which was originally identified to be 
more potent than RA in inhibiting carcinogen-induced 
rodent skin papilloma formation and has an 
exceptionally long half-life in humans leading to 
prolonged teratogenic risk [73]. While some of these 
effects may be due to the sequestration of the highly 
lipid soluble 30 in fat-storing tissues, even the active 
metabolite 29 has a long biological half-life [74], is 
teratogenic [75], and is not extensively metabolized 
[76]. The success of 30 has led to the preparation of a 
number of active analogs [71,77] although none have 
proven to be as useful as 30 or RA. 

Many retinoids also participate as cosubstrates in 
conjugation reactions, with the most prevalent conjugate




29  R=OH
30  R=OCH2CH3

32

of RA being the carboxyl linked retinoyl-β-glucuronide
(33). Glucuronide 33 was the first metabolite of RA to
be identified [78,79]. This glucuronide is secreted in
the bile when RA is administered orally to rats,
representing up to 48% of the metabolites found 2
hours after dosing [13,80-82]. Retinoyl-β-glucuronide
has also been detected in the urine [83] and the 
intestinal mucosa [84] and in the plasma of fasting 
human subjects [85]. 

Biosynthesis of 33 utilizes a microsomal 
glucuronosyl transferase and occurs in liver, kidney, 
intestine, and other tissues [86,87]. Retinoyl-β-glucuronide
has been shown to be as active as RA in
promoting growth of retinoid deficient rats [83] and as 
an inducer of cellular differentiation both in vitro [88-90] 
and in vivo [46]. However, while 33 has been 
synthesized [91], given the known instability of O-acyl 
glucuronides [92] and of molecules like 33 in particular 
[93], it remains unclear whether this metabolite plays a 
direct role in RA activity or if it is acting as a detoxifying 
product which is excreted. For example, 33 does not 
bind to the cellular retinoid binding proteins or the 
nuclear retinoid receptors [94,95] which leaves doubt 
as to the active form of this somewhat unstable 
metabolite. It is not known if 33 is acting through some 
unknown mechanism or is being hydrolyzed to RA to 
elicit its apparent activity. It may be suggested that the 
low toxicity of 33 with respect to skin, embryonic 
development, and cells in culture may reside to some 
extent in actions as a water-soluble RA prodrug 
[90,96]. 

Table 2. Effects of Retinamidoglycosides on Progression of DMBA-Induced Rat
Mammary Tumors (a,b)

Compound           Tumor Latency (days)   Tumor Incidence (%)   Tumor Number/Rat
Control diet (c)   42                     92                    1.50
RA                 49                     83                    1.17
34                 64                     58                    0.92
35                 64                     50                    0.83

(a) From reference 103. (b) 12 rats/group. (c) Rats fed AIN-76A control diet or diet
plus 1 mmol/kg diet of retinoid from 10 days before through 110 days after
intubation with 15 mg of 7,12-dimethylbenz[a]anthracene.
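
The figures in Table 2 can be put on a common footing with a few lines of arithmetic. The Python sketch below is only an illustration: it transcribes the table, converts each incidence into the implied number of tumor-bearing animals in a group of 12, and expresses the change in incidence relative to the control diet. The function names are hypothetical, and no statistical significance is implied by these descriptive numbers.

```python
RATS_PER_GROUP = 12  # footnote (b) of Table 2

# compound: (tumor latency in days, tumor incidence in %, tumors per rat)
TABLE_2 = {
    "Control diet": (42, 92, 1.50),
    "RA": (49, 83, 1.17),
    "34": (64, 58, 0.92),
    "35": (64, 50, 0.83),
}


def tumor_bearing_rats(incidence_percent, group_size=RATS_PER_GROUP):
    """Whole number of animals implied by a rounded incidence percentage."""
    return round(incidence_percent / 100 * group_size)


def relative_reduction(control_value, treated_value):
    """Percentage decrease of a treated value relative to the control value."""
    return round(100 * (control_value - treated_value) / control_value, 1)


control_incidence = TABLE_2["Control diet"][1]
for compound, (latency, incidence, tumors_per_rat) in TABLE_2.items():
    print(
        compound,
        "| tumor-bearing rats:", tumor_bearing_rats(incidence),
        "| incidence reduction vs control (%):",
        relative_reduction(control_incidence, incidence),
    )
```

Each reported incidence corresponds to a whole number of animals out of 12, which is a useful consistency check on the transcribed table.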

To further explore this question, we have prepared 
the more stable amide analogs 34 and 35 [97] and 




found them to be less toxic and much more effective as
inhibitors of carcinogen-induced rat mammary tumors 
(see Table 2) than RA despite the fact that they do not 
bind to the RARs. As a perhaps clearer test of the 
"prodrug" issue, we have concluded the synthesis of 
C-glucoside 36 and are just completing the lengthy 
synthesis of C-glucuronide 37 (Robarge and Curley, 
unpublished results). Results of analysis of the 
biological properties of these molecules will be 
forthcoming. 




33  R=COOH, X=O
34  R=CH2OH, X=NH
35  R=COOH, X=NH
36  R=CH2OH, X=CH2
37  R=COOH, X=CH2

Taurine conjugates of retinoids are also known, 
such as 38 and 39, but it is not clear what, if any, 
activity they have [98,99]. The medicinal chemistry of 
another interesting RA conjugate has perhaps been 
neglected. Retinoylation of proteins has been 
suggested to be another mode by which RA acts on 
cells [100-106]. The metabolic intermediate governing 
the retinoylation has been suggested to be
retinoyl-coenzyme A, which can transfer the retinoyl moiety to
proteins [107]. This retinoyl-CoA has been
synthesized [108] but its significance has yet to be fully
established. Hypothetically, other metabolic
intermediates may be involved in retinoylation,
including retinoyl-β-glucuronide. Thus, whether
retinoyl-CoA is a true biosynthetic intermediate, which
amino acid residues are retinoylated, and what
functional role retinoylated proteins play are questions
that warrant further attention.



Synthetic Retinoid Analogs 

The discovery of the nuclear retinoid receptor 
proteins and the relatively selective distribution of their 
subtypes in different tissues and cells has contributed 
to a significant increase in efforts to synthesize new 
retinoid analogs. These efforts are now based on the 
hope that disease or tissue specific retinoids may be

61



the selective activity of 53 has turned out to have 
greater significance in the more recent past (see 
below). 

Further evolution in this structural class has resulted 
in even more extensive restriction of side chain 
flexibility by preparation of molecules such as 58 and 
59. In particular, the tetrahydrotetramethylanthracenylbenzoic
acid 58 (TTAB) has recently been shown to
have comparable or improved binding affinity for the
RARs and somewhat improved efficacy in the F9 cell
assay relative to RA [127]. Changes in the aromatic ring
structure of TTAB to include heteroaromatic rings, such
as the furan 60, have also resulted in compounds with
differentiation-inducing activity comparable to or greater
than that of TTNPB or RA [128]. Subsequent development of
this structural type has led to the observation that the 
tetrahydrotetramethyl ring in structures like TTNPB can 
be replaced by ortho substituents and retain excellent 



bioactivity. For example, Charpentier and co-workers 
also recently disclosed [127] that the adamantyl- and 
methoxy-substituted analogue 61 shows high 
receptor affinity and differentiation-inducing activity. 
This ortho substitution pattern on TTNPB-like 
molecules (one moderately polar and one highly 
hydrophobic group) can also be replaced by two 
moderately hydrophobic residues, e.g. isopropyl 
groups. 

Isosteric replacement of the retinoid carboxylate 
group with sulfur or phosphorus acids has also been a
popular approach with heavily modified retinoids such 
as TTNPB. For example, sulfonic and sulfinic acid 
analogues 62 and 63 show good antipapilloma activity, 
although ethyl sulfone derivative 64 proved to be the 
most effective compound in this series [129]. It was 
suggested that this activity of 64 was dependent on 
oxidation to molecules like 62 and 63 in vivo. 




62  R=SO3H
63  R=SO2H
64  R=SO2CH2CH3

66

Considerable early interest in sulfone 64 was also
based on its favorably low bone toxicity [130] although 
other teratogenic malformations it causes may have 
slowed its development [131]. 

An important area for further investigation, which 
has evolved from TTNPB, has been exploration of the 
required linker between the two aromatic rings. That is, 
is the propenyl group necessary for the activity of 
TTNPB-like molecules? Shudo and co-workers have 
convincingly shown this is not the case by synthesizing 
amides 65 and 66 (known as Am80 and Am580) [132]. 
These amides show 3.5 and 7 times the efficacy of 
retinoic acid in inducing HL-60 cell differentiation. Even 
further evolution in the acceptable structural 
substitutes for the propenyl linker has been more 
recently demonstrated by the group of Shudo [128]. 
That is, the propenyl group can be replaced by an azo 
linkage producing an azobenzene analog such as 67 
which has 1.3 times the activity of retinoic acid in 
inducing HL-60 cell differentiation. 

Shortly after the discovery of the three major 
subtypes of the RAR (α, β, γ), a second class of retinoid
receptor, termed RXR, was also discovered in three
major subtypes (α, β, γ) [2,3]. As mentioned above, it
was subsequently proposed that the isomeric
metabolite 9-cis-RA (10) is the natural ligand for the
RXRs while RA preferentially binds to the RARs. While 
it is not yet completely clear what the significance of this 
variety of retinoid signalling pathways is, the differential 
tissue distribution of certain of these receptors has led 
to the hope that receptor or tissue selective retinoids 
might be discovered which would have selective 
actions or selectively reduced toxicities. With the 
discovery of the nuclear receptors, it was subsequently 
established that the potent retinoids TTNPB (50), 
TTAB (58), and the bisnaphthalenyl carboxylic acid 
TTNN (59) were relatively selective ligands for the
RARs [133] and activators of RAR-selective pathways.

68  R=H, X=O
69  R=H, X=CH2
70  R=CH3, X=O
71  R=CH3, X=CH2

Subsequently, it was recognized that molecules like 3- 
methyl TTNPB (53) have selective affinity for the RXRs 
[134,135]. This observation may account for the earlier 
noted selective action of 53 in differentiating F9 but 
not HL-60 cells (see above). Furthermore, it is likely this 
RXR-selectivity occurs because the presence of the 3- 
methyl group of 53 causes the more planar stilbene 
moiety of TTNPB (50) to now be twisted into a more
9-cis-RA-like conformation. Using this type of rationale, a
number of research groups are now developing similar
molecules with high selectivity for RXR
receptors/pathways which resemble 53 but are
pre-organized presumably into a more 9-cis-RA-like
conformation. For example, Dawson et al. [136] and 
Boehm and co-workers [137,138] have been 
developing compounds such as 68-74 which show 
very high RXR-selectivity. In fact, 71, designated as 
LGD1069, is the first RXR-selective retinoid to enter 
clinical trials for the treatment of certain cancers. 

Replacement of the benzoic acid ring of TTNPB and 
3-methyl TTNPB with heteroaromatic rings has also 
recently been a successful approach in the 
development of new retinoids with unique properties. 
Certainly, this approach has resulted in the highly RXR- 
selective ligand 74. In addition, a number of other 3- 
methyl TTNPB analogs with heteroaromatic rings have 
been described which show either RXR-specificity
(75-77) [see Fig. (5)] or, interestingly, RXR and RARβ,γ
selectivity (78 and 79) [139,140]. The preparation of
retinoids with receptor subtype specificity as opposed 
to RAR or RXR selectivity has also been identified as a 
goal because the selective tissue distribution of 
subtypes may permit development of tissue or disease- 
specific retinoids perhaps with reduced toxicities. In 
addition to 78 and 79, recent successes in this area 
have been primarily with the development of
RARβ,γ-selective agents although RARα-selective agents such

Abstract: Over the past two decades, medicinal chemistry has witnessed a
fundamental renaissance with regard to the methodologies and
approaches to drug design and discovery. Rational design based on
computational chemistry and solid state structure of receptors and enzymes is still
an important approach in this field. Unfortunately, bacterial, viral, and cell
resistance to drugs as well as the appearance of previously unknown diseases
have dramatically increased the demand for new and more potent drugs. On the other hand, the
random screening of chemical compounds from natural sources has not only become limited in 
terms of diversity but also in terms of general methodologies for the rapid identification of the 
potent molecule. For these reasons, combinatorial chemistry, the science of diversity and 
"rational screening", appears to be the natural direction in order to overcome these limitations. 
Combinatorial chemistry has attracted researchers from areas as remote as materials science and 
catalysis and will surely bring more advances in the future. In this review we will discuss the 
methods for the generation of diversity along with screening processes associated with these, 
some applications in medicinal chemistry, and molecular recognition and catalysis. 




Introduction to Combinatorial Chemical Sciences [1]

The essence of drug discovery is to find, as quickly 
and cost-effectively as possible, the most potent, 
selective, bioavailable, biocompatible, non-toxic, 
chemically accessible, competitive, patentable, and 
marketable compound(s). Medicinal chemistry has 
evolved through the centuries following these 
guidelines, from the use of natural product extracts with 
known healing properties [2], to the most sophisticated 
computer aided drug design (CADD) [3] based on 
quantitative structure-activity relationship (QSAR)
studies and the molecular bases that underlie
receptor-ligand interactions. Chemical and crystal
structure analysis of isolated receptors and their ligands 
have revealed the mechanism of action and 
established qualitative and quantitative structure 
activity relationships. Understanding how to enhance 
these interactions, the synthetic organic chemists 
rationally produced new and more potent molecules in 
response to drug resistance, the appearance of new 
diseases, and growing demand for new alternatives. 
Although heavily dependent upon an existing 
knowledge of the molecular and the structural bases of 
receptor-drug interactions, CADD became a powerful 
tool for predicting what family of molecules may 
produce a successful drug lead and served as a source 
of inspiration for the medicinal chemist. In the
meantime, en masse screening of hundreds of thousands of
compounds from in-house collections of pure synthetic 
or natural products isolated from fermentation residues, 
fungi, plants, and other sources, afforded several 
opportunities for the discovery of new and potent 
leads, especially when no information was available on 
the molecular basis of the interaction between the host 
and guest. Although still a very productive approach to 
drug discovery, high throughput screening techniques 
(HTS) [3a], with all the advantages of ultra-modern 
technological advances in robotics and artificial 
intelligence, are facing important new challenges. Most 
importantly these are: 

i) Maintaining a ready supply of new compounds. 
Today's technology and instrumentation allow
the mass screening of ~10^6 samples per annum,
which corresponds approximately to the size of
an in-house library of a large pharmaceutical
company. These libraries are inherently
restricted in terms of diversity because they
derive from compounds previously prepared with
a specific purpose in mind.

ii) Design of increasingly sophisticated, rapid, and 
cost-effective screening procedures [4], and 
subsequent methods for isolation and 
characterization of the active molecule. 



© 1996 Bentham Science Publishers B V 






344 Current Medicinal Chemistry, 1996, Vol. 3, No. 5 

iii) Efficient handling of massive amounts of 
valuable information that can be extracted from 
these screenings. This information may be 
crucial in future studies to predict mechanisms of 
action from compound structures [5]. 

Combinatorial chemistry is proposing new solutions 
to these problems, and in the ideal case, may even be 
the ultimate solution to drug design and discovery (vide 
infra). Combinatorial chemical sciences are, in essence, 
the expeditious creation of large libraries of molecules 
with biological [1], chemical [1], catalytic [6], or physical 
[7] properties, followed by "rational screening" to 
identify the relevant molecule. Several factors have 
contributed to the emergence of this field. Some may 
consider it to be simply the natural evolution of the field 
of drug design and discovery. Its application is, 
however, a complex interplay of rational drug design, 
pure solution and solid phase synthetic organic 
chemistry [8], molecular biology, biotechnology in 
association with the powerful tools of in vitro 
reproduction and evolution, receptor development, 
and fundamental advances in miniaturization, robotics, 
artificial intelligence, physics, and mathematics. 
Together, these fields have convergently contributed 
to the emergence of this powerful tool. 

According to the United States Office of 
Technology Assessment, it takes an average of 12 
years and $359 million (approximately $7500 per 
compound tested) to discover, develop, and market a 
new drug. 10% of this amount is actually invested in the
early stage of the lead discovery. Combinatorial
chemistry hastens this early phase and reduces the 
overall cost of discovering a new drug. Financially 
speaking, this is not the most exorbitant step, but 
saving up to two years in the overall process might be 
very profitable for the company, especially when one 
takes into consideration that several companies are 
usually developing similar drugs and that only the very
first to market the product will reap the largest benefits.
In addition, not only does the combinatorial approach 
accelerate and lower the cost of the process, it also 
provides a repertoire of leads and targets which may be 
chemically very diverse and can serve as alternative 
drugs in future developments. Fig. (1) shows
schematically the fundamental difference between the 
traditional approach to drug discovery and the 
combinatorial approach. At each portion of the 
trajectory, whatever the approach followed, there is a 
multitude of steps involving the arsenal of techniques 
traditionally used in the drug discovery process. Once 
the initial target is attained, it may either enter the 
preliminary phase of clinical trials, or reenter the cycle of 
drug improvement. 
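
The cost figures quoted above lend themselves to a short back-of-the-envelope calculation. The Python sketch below is illustrative only: it uses the numbers given in the text (12 years, $359 million, roughly $7500 per compound tested, 10% spent on lead discovery, up to two years saved), and the variable names are hypothetical.

```python
# Figures quoted in the text (attributed there to the United States
# Office of Technology Assessment).
TOTAL_COST_USD = 359_000_000
COST_PER_COMPOUND_USD = 7_500
TOTAL_YEARS = 12
LEAD_DISCOVERY_PERCENT = 10   # share of total cost spent on lead discovery
YEARS_SAVED = 2               # "up to two years", treated here as an upper bound

# Number of compounds implied by the per-compound cost.
compounds_tested = round(TOTAL_COST_USD / COST_PER_COMPOUND_USD)

# Absolute amount spent in the early lead-discovery stage (integer arithmetic).
lead_discovery_cost_usd = TOTAL_COST_USD * LEAD_DISCOVERY_PERCENT // 100

# Fraction of the overall development time that could be saved.
time_saved_fraction = YEARS_SAVED / TOTAL_YEARS

print(f"compounds tested per marketed drug: about {compounds_tested:,}")
print(f"lead discovery cost: ${lead_discovery_cost_usd:,}")
print(f"maximum time saved: {time_saved_fraction:.0%} of the process")
```

On these assumptions the early discovery phase accounts for a modest share of the total cost, which is consistent with the point made above that the main benefit of the combinatorial approach lies in the time saved.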

This review will describe this new field of chemical 
sciences in the context of medicinal chemistry and 
molecular recognition and catalysis. We will first address 
the key issues concerning library design and 
screening. The following section will outline the most 
contemporary developments of combinatorial sciences 
involving biological systems. The last section will 
include our recent work and the efforts of others to
combine combinatorial chemistry with supramolecular
recognition and catalysis.

Fig. (1). A comparison between the traditional (left) and combinatorial (right) approaches to drug design and discovery. Each
portion of the trajectory may involve the standard arsenal used in drug design (CADD, QSAR, solid state and solution phase
structural studies).



Design and Preparation of a Chemical 
Library [9] 



General Methodologies 

The question of what makes a library eligible 
for screening must be addressed first and foremost. 
A library should meet three criteria: i) it should be 
easy to construct, large, and chemically diverse; ii) 
each component of the library should be equally 
represented and present in sufficient quantity to give 
rise to a measurable response; iii) the active 
components of the library should be easily identified 
and structurally characterized. To accomplish this 
crucial step, a highly sensitive and specific assay 
should be developed at the same time as, or even before, 
the construction of the library. This point will be 
discussed in the next section. 

The fundamental basis of combinatorial chemistry is 
the creation of an exponentially increasing number of 
distinct molecules after each synthetic step. There are 
several methods to create a large number of molecules. 
They can be divided into two groups, i) multiple parallel 
synthesis, and ii) split synthesis. In each case the library 
can be constructed using solution or solid phase 
synthesis. This section will describe each one of these 
approaches. It has been estimated that 10^200 is a 
conservative figure for the number of small (MW 
<750) organic molecules made possible by applying 
the rules of valence to carbon and its neighbours on 
the periodic table [10]. One, and only one of these 
molecules should possess the optimal properties of a 
drug candidate. It is safe to profess that such a complete 
and diverse repertoire is in all respects inconceivable 
for the mere reason that such a collection would weigh 
approximately 10^128 times as much as the universe! 
Therefore, the "rational" design of the library becomes 
a prerequisite for its success in producing an entity with 
the desired properties. The preparation of a library 
preferably involves parallel synthesis so that the 
number of compounds prepared is greater than the 
number of chemical steps required. There are 
numerous methods to prepare a library and most of 
them are variations on the same theme: the portion 
mixing method [11], also called the split synthesis [12], 
one bead-one compound approach or Selectide 
process [13], split-and-mix, or divide-couple-recombine 
(DCR) [14]. The tea-bag method [15], the pin [16] and 
Diversomer [17] technologies, and light-directed 
spatially addressable parallel synthesis [18] will also be 
discussed in this review. 

Fig. (2). Combinatorial split synthesis involving 3 different sets of molecules. Each set is used for the combinatorialization 
(randomization) of one position of the library. The solid spheres represent the matrix to which the building blocks (squares) are 
sequentially attached. 12 compounds are obtained in 7 chemical steps. 

Fig. (3). Permutational split synthesis involving a set of 3 building blocks for the combinatorialization (randomization) of each 
position of the library. The solid spheres represent the matrix to which the building blocks (squares) are sequentially attached. 
27 compounds are obtained in 9 chemical steps. 

The Split Synthesis 

The split synthesis using an oligomerizable set of 
building blocks (e.g. amino acids) can lead to 
combinatorial or permutational libraries. The first uses a 
combination of building blocks from different sets 
leading to a well defined order, sequence and length 
(Fig. (2)). The second one uses the same set of 
molecules and produces all possible sequences with all 
lengths and orders within the set (Fig. (3)). The solid 
spheres represent a matrix to which the building blocks 
(squares) are sequentially attached after each mixing 
and splitting step. In practice, the resin is divided into as 
many portions as there are building blocks to be used in 
the library synthesis. Each portion of the resin is coupled 
quantitatively to one building block and then all the 
portions are mixed and split as before. The scheme is 
repeated as many times as there are chemical steps in 
the synthesis of the target molecule (e.g. 3 steps for a 
tripeptide). 
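The split, couple and mix cycle described above can be sketched in a few lines of Python. This is only an idealized illustration, not a description of any laboratory protocol: it assumes that every portion of resin contains every sequence made so far (i.e. a large excess of beads) and that each coupling is quantitative. The function name `split_and_mix` is a hypothetical choice for this sketch.

```python
def split_and_mix(building_block_sets):
    """Idealized split synthesis.

    building_block_sets: one collection of building blocks per position.
    Returns (set of sequences present in the pooled resin, number of couplings).
    Assumes every portion holds every sequence made so far (large bead excess).
    """
    pool = {()}                 # pooled resin, nothing attached yet
    coupling_steps = 0
    for blocks in building_block_sets:
        new_pool = set()
        for block in blocks:    # split: one reaction vessel per building block
            coupling_steps += 1
            for sequence in pool:
                new_pool.add(sequence + (block,))   # couple one block per vessel
        pool = new_pool         # mix: recombine all portions
    return pool, coupling_steps


if __name__ == "__main__":
    # Combinatorial case of Fig. (2): three different sets.
    library, steps = split_and_mix([("a", "b"), ("c", "d", "e"), ("f", "g")])
    print(len(library), "compounds in", steps, "chemical steps")

    # Permutational case of Fig. (3): the same set at every position.
    library, steps = split_and_mix([("Ala", "Phe", "Leu")] * 3)
    print(len(library), "compounds in", steps, "chemical steps")
```

Under these assumptions the sketch reproduces the counts quoted in the figure captions (12 compounds in 7 steps, 27 compounds in 9 steps).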

The advantage of combinatorial over 
permutational synthesis is that compounds from 
chemically distinct building block sets can be 
assembled to form substances that are impossible to 
obtain from permutational synthesis, whatever its 
diversity. Permutational libraries use the same 
chemistry to combine molecules of the same set, which 
allows the preparation of libraries of different size, order 
and length. Combinatorial and permutational oligomeric 
libraries (peptides, oligonucleotides) are directional 
because the order of introduction of the building 
blocks defines each compound; for instance, ABC and 
CBA are two distinct components. A three amino-acid 
set (Ala, Phe, Leu) generates 27 (3^3) different 
permutational tripeptides, including Ala-Leu-Phe and 
Phe-Leu-Ala, which are distinct molecules due to the 
directionality of the oligomer. This inherently increases 
the size of the library simply by extending the length of 
the molecules, but limits its diversity. 

Following the split synthesis scheme shown in Fig. 
(2), a directional oligomeric combinatorial library built 
from the sets of compounds S_1, S_2, ..., S_n containing 
respectively N_1, N_2, ..., N_n building blocks will involve 
N_1 + N_2 + ... + N_n chemical steps and will contain 
N_1 × N_2 × ... × N_n members. For instance, a directional 
combinatorial library built upon the generic formula 
S_1-S_2-S_3, where S_1 is a set of 2 building blocks (a,b), 
S_2 a set of 3 building blocks (c,d,e) and S_3 a set of 2 
building blocks (f,g), will contain 12 (2 × 3 × 2) members 
and will involve 7 (2 + 3 + 2) chemical steps [19]. A 
directional permutational library of an oligomer of length 
L, built from a set of N components (a, b, ..., n), will 
involve N × L chemical steps and will contain N^L 
members. For instance, a directional permutational 
library built upon the generic formula S-S-S, where S is a 
set of 3 building blocks (a,b,c), will contain 27 (3^3) 
members and will involve 9 (3 × 3) chemical steps as 
shown in Fig. (3). 

A library can be evaluated by its size and diversity. The 
efficiency of a split synthesis may be defined as the 
number of compounds per chemical step [20], and is 
given by Equation (1) for a combinatorial library and by 
Equation (2) for a permutational one. 

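$$\text{Efficiency} = \frac{N_1 \times N_2 \times \dots \times N_n}{N_1 + N_2 + \dots + N_n} = \frac{\prod_{i=1}^{n} N_i}{\sum_{i=1}^{n} N_i} \qquad \text{Equation (1)}$$

$$\text{Efficiency} = \frac{N^L}{N \times L} \qquad \text{Equation (2)}$$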
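As a quick numerical check of the two efficiency definitions (number of compounds per chemical step), the following sketch evaluates them for the worked examples given earlier in this section. The function names are hypothetical and chosen only for this illustration.

```python
from math import prod


def combinatorial_efficiency(set_sizes):
    """Compounds per chemical step for a combinatorial library:
    product of the set sizes divided by their sum."""
    return prod(set_sizes) / sum(set_sizes)


def permutational_efficiency(n_blocks, length):
    """Compounds per chemical step for a permutational library:
    N**L compounds obtained in N * L steps."""
    return n_blocks ** length / (n_blocks * length)


if __name__ == "__main__":
    # Sets of 2, 3 and 2 building blocks: 12 compounds in 7 steps.
    print(combinatorial_efficiency([2, 3, 2]))
    # One set of 3 building blocks, trimers: 27 compounds in 9 steps.
    print(permutational_efficiency(3, 3))
    # Efficiency grows quickly with oligomer length for a fixed set.
    for length in range(2, 7):
        print(length, permutational_efficiency(20, length))
```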

The size of a directional combinatorial synthesis is 
sensitive to the number of sets and their size. Naturally, 
a combinatorial synthesis incorporates more structural 
variety through the use of diverse sets of building 
blocks and synthetic steps. The directional 
permutational split synthesis is more sensitive to the 
length of the synthesized molecule. Many more 
compounds per chemical step are prepared through 
this approach (Table (1) and Fig. (4)). Because of the 
uniformity of the chemistry involved throughout the 
synthesis of the oligomer, this approach does not 
explore the structural and conformational space as 
effectively as a combinatorial library. 

Table 1. Size of Permutational Peptide Libraries of Various Lengths Obtained Using Sets of Molecules of Different Sizes 

Size of the   Number of possible oligomers from a set of N molecules 
set (N)       Dimers       Trimers      Tetramers    Pentamers    Hexamers 

20            4.0 × 10^2   8.0 × 10^3   1.6 × 10^5   3.2 × 10^6   6.4 × 10^7 
40            1.6 × 10^3   6.4 × 10^4   2.6 × 10^6   1.0 × 10^8   4.1 × 10^9 
60            3.6 × 10^3   2.2 × 10^5   1.3 × 10^7   7.8 × 10^8   4.7 × 10^10 
80            6.4 × 10^3   5.1 × 10^5   4.1 × 10^7   3.3 × 10^9   2.6 × 10^11 
100           10^4         10^6         10^8         10^10        10^12 
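The entries of Table 1 follow directly from the N^L rule for a directional permutational library. A minimal sketch that regenerates them is given below; the function names `library_size` and `table_1` are hypothetical, introduced only for this illustration.

```python
def library_size(n_blocks, length):
    """Number of distinct directional oligomers of a given length
    built from a set of n_blocks building blocks (N**L)."""
    return n_blocks ** length


def table_1(set_sizes=(20, 40, 60, 80, 100), lengths=(2, 3, 4, 5, 6)):
    """Return {N: [sizes for dimers ... hexamers]}."""
    return {n: [library_size(n, length) for length in lengths]
            for n in set_sizes}


if __name__ == "__main__":
    print("N    dimers   trimers  tetramers pentamers hexamers")
    for n, sizes in table_1().items():
        print(f"{n:<4}", "  ".join(f"{size:.1e}" for size in sizes))
```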



In practice, split synthesis based methods generate 
libraries of resin bound compounds where each bead 
carries only one type of compound. The compounds 
can be completely detached or left on the bead for 
biological, catalytic, binding or other types of assays. 
They can also be partially detached leaving the other 
part for subsequent characterization studies or other 
assays [21]. The split synthesis strategy has already 
generated several drug leads (vide infra), some of 
which have already entered clinical trials such as a factor 
Xa inhibitor [22]. 

Library of Libraries or One Bead-One Motif 
Approach [22] 

This concept in library design was introduced to 
overcome the limitation imposed by solid phase chemistry 
on the library size. A split synthesis scheme to 
generate a decapeptide permutational library using a 
set of 20 amino acid building blocks will necessitate at 
least ~10^13 resin beads (one bead per compound). 
Using 130 µm diameter beads, this number would 
correspond to 10.2 tonnes of resin (Table 2), a quantity 
that no industrial or academic institution can possibly 
envisage. As will be briefly discussed in the last section 
of this review, this is also the raison d'être of what has 
been recently termed representative libraries [23]. 

It is, however, unnecessary to construct the entire 
library in order to identify active components because 
the number of active beads (compounds) appears to 
depend on the number of critical residues or 
pharmacophores in the peptide sequence required for 
a minimal measurable interaction with a given receptor, 
but not on the length of the peptide sequence. A 
hexapeptide, for instance, may very well possess 3 
structural building blocks that impose a certain 
conformational state, and 3 pharmacophore building 
blocks that interact directly with the receptor. The 
combination of these two "motifs" would contribute 
more or less to the specificity and stability of the 
complex. 

Table 2. Number of Beads and the Weight of Resin Required, Using One Bead-One Compound 
Approach and One Bead-One Motif (Library of Libraries) Approach to Prepare Permutational 
Libraries of Compounds of Length Varying from 3 to 10 Residues, Using a Set of 20 Amino 
Acid Building Blocks 

                   One bead-one compound               One bead-one motif 
Length             Number of beads   Weight of resin   Number of beads   Weight of resin 
(building blocks)  (× 10^-3)         (g)               (× 10^-3)         (g) 

3                  8                 0.008             8                 0.008 
4                  160               0.16              32                0.032 
5                  3,200             3.2               80                0.08 
6                  64,000            64                160               0.16 
7                  1,280,000         1.3 × 10^3        280               0.28 
8                  25,600,000        2.6 × 10^4        448               0.448 
9                  512,000,000       5.1 × 10^5        672               0.672 
10                 10,240,000,000    1.0 × 10^7        960               0.960 

The basic idea of this concept is the preparation of a 
library of compounds having only a limited number of 
pharmacophores (building blocks with residues that 
may interact directly with the receptor) combined with 
structural building blocks. The next step in this 
approach was first to define how many and which amino 
acids of the set should be structural and/or 
pharmacophores and second, how many 
pharmacophores per sequence (compound) are 
necessary to induce a measurable response in an 
appropriate assay. These are, unfortunately, questions 
without definite answers since every amino acid can 
contribute to the structure/conformation of the peptide 
as well as interact directly with a given receptor, and the 
number of pharmacophores per structure may vary from 
one compound to another. The solution that was 
proposed for these problems was to use mixtures of 
building blocks in the structural positions and defined 
building blocks in the pharmacophore positions; in this 
manner, libraries of motifs will be generated instead of 
libraries of compounds. It implies also that each bead 
will contain a multitude of compounds instead of one, 
and hence the transition from one bead-one 
compound strategy to one bead-one motif strategy. 
This approach allows the coverage of all possibilities at 
the structural positions and all combinations of 
pharmacophore building blocks within the sequence. 

Fig. (4). Graphic representation of the data from Table 1. Size of a permutational peptide library of various lengths (dimers- 
hexamers) obtained using a set of amino acid building blocks of various sizes (20-100). 

With regard to the number of pharmacophores, a 
starting point has to be defined. For instance, half of 
the positions could be considered as pharmacophores; 
subsequent studies will show whether this distribution (50/50 
pharmacophore/structural positions) generates 
more or fewer hits. In any case, this distribution can be 
readjusted. Literature precedents have shown that this 
is a reasonable ratio [24]. 

Upon fixing the number of pharmacophores and 
structural positions within a sequence, the number of 
active compounds can be estimated according to 
Equation (3): 



$$H = x \times P_f \times \frac{S}{A^n} \qquad \text{Equation (3)}$$

where H is the expected number of positive hits (beads 
carrying an active peptide), x is the number of different 
motifs (group of pharmacophores or critical residues), 
Pf is the placement factor or the number of possible 
placements of each motif (group or cluster of 
pharmacophores) within the peptidic chain, S is the 
number of beads screened, A is the number of building 
blocks of the set used for randomization (vide infra), 
and n is the number of positions used as 
pharmacophores. 

Table 3. All 20 Possible Arrangements ((3 + 3)!/(3! × 3!)) of a Motif Made of Three Pharmacophores (P) That Can 
Be Obtained Within a Hexapeptide Structure in Which S Denotes a Structural Unit 

 1   PPPSSS        11   SPPSSP 
 2   PPSPSS        12   SPSPSP 
 3   PPSSPS        13   SPSSPP 
 4   PPSSSP        14   SPPPSS 
 5   PSPSSP        15   SPPSPS 
 6   PSSPSP        16   SPSPPS 
 7   PSSSPP        17   SSPPPS 
 8   PSPPSS        18   SSPPSP 
 9   PSPSPS        19   SSPSPP 
10   PSSPPS        20   SSSPPP 

To illustrate this methodology, let us consider a 
hexapeptide library made out of a set of 20 amino acid 
building blocks with the implicit assumption that all 20 
amino acids can act as pharmacophores or structural 
building blocks. Let us suppose that there are 3 
pharmacophore positions (P) and 3 structural positions 
(S). Table 3 shows all 20 possible combinations of 3 
pharmacophore positions and 3 structural positions 
within the hexapeptide. This number (Pf) can be 
obtained using Equation 4 where p is the number of 
pharmacophore positions, s the number of structural 
unit positions, and p+s is the total number of positions 
(number of residues in the peptide). 

$$P_f = \frac{(p + s)!}{p! \times s!} \qquad \text{Equation (4)}$$
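The placement factor can be verified by brute force: choosing which p of the p + s positions carry a pharmacophore enumerates every arrangement exactly once. The sketch below (function names are hypothetical, for illustration only) computes Pf from the factorial formula and lists the corresponding P/S patterns, which can be compared with Table 3.

```python
from itertools import combinations
from math import factorial


def placement_factor(p, s):
    """Number of ways to place p pharmacophore positions among p + s
    positions: (p + s)! / (p! * s!)."""
    return factorial(p + s) // (factorial(p) * factorial(s))


def arrangements(p, s):
    """List every P/S pattern of length p + s containing exactly p 'P'."""
    length = p + s
    patterns = []
    for positions in combinations(range(length), p):
        patterns.append("".join("P" if i in positions else "S"
                                for i in range(length)))
    return patterns


if __name__ == "__main__":
    patterns = arrangements(3, 3)
    print(placement_factor(3, 3), len(patterns))
    for number, pattern in enumerate(patterns, start=1):
        print(number, pattern)
```

The enumeration order produced here need not match the numbering used in Table 3; only the set of patterns is the same.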



Each arrangement can be considered a "positional" 
sublibrary (vide infra) where the P positions are 
introduced using a split synthesis scheme from the set 
of 20 amino acids and the S positions are introduced as 
mixtures of the 20 amino acid set under conditions 
that allow equal incorporation of each building block 
(vide infra). Each one of these sublibraries will involve 3 
split synthesis steps and 3 random couplings. Each 
bead from each sublibrary will no longer contain a single 
hexapeptide in i copies but a mixture of 8000 (20^3) 
hexapeptides in i/8000 copies each, where now only 
the P positions are fixed. In other words, each bead has 
become a library of hexapeptides with 3 fixed 
pharmacophore positions. Since the overall strategy 
would involve 60 (3 × 20) split synthesis steps, which is 
very tedious and time consuming, an algorithmic 
diagram was also developed that allows the 
combination of identical coupling steps so that the 60 
split synthesis steps required in the case of a 
hexapeptide library were reduced to only 12 steps. The 
number of steps is related to the length of the peptide 
and the number of pharmacophores (p) and structural 
building blocks (s) via Equation (5). 

$$\text{Number of steps} = p \times (s + 1) \qquad \text{Equation (5)}$$
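The saving obtained by combining identical coupling steps can be illustrated numerically. The sketch below assumes, following the text, that the unoptimized count is the number of split synthesis steps per sublibrary (p) multiplied by the number of positional sublibraries (Pf), and that the optimized count is p × (s + 1). The function names are hypothetical.

```python
from math import comb


def naive_split_steps(p, s):
    """p split synthesis steps for each of the Pf positional sublibraries."""
    return p * comb(p + s, p)


def pooled_split_steps(p, s):
    """Split synthesis steps after combining identical couplings: p * (s + 1)."""
    return p * (s + 1)


if __name__ == "__main__":
    p, s = 3, 3   # hexapeptide with 3 pharmacophore and 3 structural positions
    print(naive_split_steps(p, s), "->", pooled_split_steps(p, s))
```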

The size L of a library of libraries, which corresponds 
to the total number of possible motifs, is obtained by 
multiplying the number of permutations in a library with one 
positional motif (A^p) by the number of positional motifs 
(Pf): 






350 Current Medicinal Chemistry, 1996, Vol. 3, No. 5 



Hicham Fenniri 



$$L = A^p \times P_f \qquad \text{Equation (6)}$$

The application of Equation (6) to the above 
example gives 160,000 (20^3 × 20) members for a 
library of libraries. This number is much smaller than that of the 
corresponding permutational library, which would 
contain 64,000,000 members in the one bead-one compound format (L = 
A^(p+s) = 20^(3+3)). See Table (2) for a comparison between 
the two approaches. The screening of this kind of 
library would identify a motif within a random sequence 
rather than a single defined sequence. A second 
generation library would identify the best building 
blocks at the structural positions. Also, depending on 
its size, this library can be "one bead-one compound" 
or "one bead-one motif". 
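The bead counts of the two approaches can be compared with a short calculation. The sketch below uses the library-of-libraries size (A^p multiplied by the placement factor) and the full permutational size (A^(p+s)); it assumes three pharmacophore positions throughout, which appears to be the convention behind the one bead-one motif column of Table 2. Function names are hypothetical.

```python
from math import comb


def one_compound_size(a, p, s):
    """Full permutational library: A**(p + s) distinct compounds (beads)."""
    return a ** (p + s)


def library_of_libraries_size(a, p, s):
    """Number of motifs: A**p permutations per positional motif
    times the (p + s)! / (p! s!) possible placements."""
    return a ** p * comb(p + s, p)


if __name__ == "__main__":
    a, p = 20, 3   # 20 amino acids, 3 pharmacophore positions (assumed fixed)
    print("length  one bead-one compound  one bead-one motif")
    for length in range(3, 11):
        s = length - p
        print(f"{length:<7} {one_compound_size(a, p, s):<22,} "
              f"{library_of_libraries_size(a, p, s):,}")
```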

The application of Equation (3) to the case of a 
hexapeptide library of libraries with 3 P and 3 S 
indicates that there would be 1 hit every 400 beads: 

H = 1 × 20 × S/20^3  ==>  S = 400 × H 
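The hit-rate estimate above can be reproduced with the expected-hits relation H = x × Pf × S / A^n. The following sketch is illustrative only; the function names are hypothetical and the inputs are the values of the worked hexapeptide example (one motif, 20 placements, 20 building blocks, 3 pharmacophore positions).

```python
def expected_hits(x, pf, beads_screened, a, n):
    """H = x * Pf * S / A**n."""
    return x * pf * beads_screened / a ** n


def beads_per_hit(x, pf, a, n):
    """Number of beads that must be screened, on average, for one hit."""
    return a ** n / (x * pf)


if __name__ == "__main__":
    print(beads_per_hit(x=1, pf=20, a=20, n=3))                  # beads per hit
    print(expected_hits(x=1, pf=20, beads_screened=4000, a=20, n=3))
```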

Light Directed Spatially Addressable 
Chemical Synthesis [1j,25] 

This methodology combines standard solid phase 
synthesis and photolithography technology. The 
members are physically segregated and spatially 
located on a matrix surface. Using photolabile 
protecting groups on each of the building blocks for 
solid phase synthesis in combination with a set of 
masks, the spatial resolution of photolithography allows 
for very localized activation of small derivatized 
surfaces, as shown in Fig. (5). After regioselective 
deprotection of the matrix surface using a set of masks, 
the whole surface is submitted to reaction with a 
building block. The same or another surface area is 
photodeprotected using different sets of masks and 
coupled with another building block. Fig. (5) shows that 
the masks have only one window and are always 
oriented vertically, but in practice all orientations are 
possible with one or more windows. The pattern of the 
masks and the sequence of reactants defines the 
products and their location. After the final deprotection 
the functionalized matrix is incubated with a labelled 
receptor that can be localized upon binding its ligand, 
allowing at the same time the identification of the ligand 
according to its location on the surface. The advantage 
of this method is that it requires only micropreparation 
and small consumption of chemical reagents and 
biological materials since, on 1 cm^2, 40,000 compounds 
can be spatially located with high resolution. 
Combinatorial masking strategies can be used to form a 
large number of compounds in a small number of 
chemical steps. For instance, all 400 (20^2) possible 
dimers from a set of 20 amino acids can be obtained in 
40 (2 × 20) masking/photodeprotecting/coupling steps 
as shown in Fig. (6) [18,26]. This strategy follows the 
equations described above for a split permutational 
synthesis: the total number of compounds is equal to 
N^L, where N is the number of building blocks in the set 
and L the length of the peptides. L is also equal to M/N 
where M is the number of cycles. 

Fig. (5). Light directed spatially addressable parallel synthesis for the generation of chemical libraries. X represents a 
photocleavable protecting group. 

Fig. (6). Light directed spatially addressable combinatorial synthesis for the generation of a chemical library of all possible 
dimers from a set of 20 building blocks. 
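The way a small number of masking cycles yields every dimer can be mimicked with a toy model of the surface. The sketch below is an assumption-laden illustration, not the published masking scheme: it supposes that the first round uses N masks each exposing one column stripe and the second round uses N masks each exposing one row stripe, so that every cell of an N × N grid receives a unique pair. The function name `dimer_array` and the block labels are hypothetical.

```python
def dimer_array(blocks):
    """Toy model of light-directed synthesis of all dimers.

    Round 1: one mask per column stripe, coupling one block per cycle.
    Round 2: one mask per row stripe, coupling one block per cycle.
    Returns (grid of sequences, number of mask/deprotect/couple cycles).
    """
    n = len(blocks)
    grid = [[() for _ in range(n)] for _ in range(n)]
    cycles = 0
    for col, block in enumerate(blocks):        # round 1: column masks
        for row in range(n):
            grid[row][col] = grid[row][col] + (block,)
        cycles += 1
    for row, block in enumerate(blocks):        # round 2: row masks
        for col in range(n):
            grid[row][col] = grid[row][col] + (block,)
        cycles += 1
    return grid, cycles


if __name__ == "__main__":
    blocks = [f"B{i}" for i in range(1, 21)]    # 20 placeholder building blocks
    grid, cycles = dimer_array(blocks)
    distinct = {cell for row in grid for cell in row}
    print(len(distinct), "dimers in", cycles, "cycles")
```

In this model the location of a cell identifies its sequence, which is the property exploited when a labelled receptor is localized on the surface.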

Multiple peptide synthesis 

There are several methods for producing large 
numbers of peptides, all based upon improved 
versions of standard solid phase synthesis techniques 
[8]. 

Multipin synthesis 

Multipin synthesis was the first method introduced for 
the preparation of large peptide libraries [16]. It is based 
on parallel peptide synthesis on polyacrylate grafted 
polyethylene pins arrayed in a microtiter plate format. 
The spherical detachable heads of the pins can hold up 
to 2 µmol of the growing peptide [27]. The peptide 
can be left on the pin or detached for biological assays 
[28]. This method, also called the mimotope strategy (from 
epitope mimetic), has found wide application in epitope 
mapping [29], specific ligand discovery for various 
types of receptors, as well as many other applications 
[30]. 

Diversomer technology [17] 

Diversomer technology is an automated approach to the simultaneous 
synthesis of a large number of molecules. The 
apparatus is composed of a series of gas dispersion 
tubes to contain the resin during and following the 
reaction cycles. The tubes are immersed in a reservoir 
block containing the reagents and solvents that diffuse 
towards the resin through the glass frits located at the 
bottom of the tubes. This approach allows for a facile 
separation of the resin from the reaction mixture at the 
end of each reaction cycle. Once the reaction block is 
attached to the tubes, they become a single unit that 
can be agitated, heated or cooled as necessary. The 
whole apparatus can be placed under inert atmosphere 
in a manifold. A similar approach that uses 
polypropylene deep well plates was recently reported 
[31]. 

Tea bag method [14,15,32] 

It consists of the physical separation of compounds 
attached to a polymeric matrix inside sealed porous 
polypropylene bags. The bags are successively 
immersed in solutions containing one or several 
activated building blocks at a concentration that favors 
comparable incorporation for each one of them [32b]. 
The bags can then be washed, worked-up collectively, 
and submitted to another reaction cycle. The 
advantage of this approach is that the physical 
separation allows for differently functionalized matrices 
to be reacted with the same building blocks (parallel 
synthesis). It also has the advantage of producing 
relatively large amounts of soluble and characterizable 
material (up to 500 µmol) for subsequent biological 
studies. This method, which has also been automated 
[33], was successfully applied in various studies (see 
next section). 

Multiple peptide synthesis through the 
random coupling of amino acids mixtures 

This method consists simply of the simultaneous 
coupling of mixtures of activated amino acids to a single 
resin bound amino acid or peptide [29b,30d,30e,34]. 
This approach was, for instance, successfully applied in 
the generation of peptide libraries for the subsite 
mapping of 3C proteinase from hepatitis A virus [35], 
opioid compounds, trypsin inhibitors, and other 
ligands [36]. The major drawback of this approach is the 
difficulty in controlling the final product distribution as a 
result of differences in reactivity of the activated amino 
acids. To overcome this limitation, predetermined 
relative concentrations of each activated amino acid 
were used in order to compensate for differences in 
reactivity between the building blocks [32,34e]. 
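The idea of compensating for reactivity differences with adjusted concentrations can be sketched under a deliberately simple kinetic assumption, which is ours and not taken from the cited work: each activated amino acid is incorporated in proportion to its relative rate constant multiplied by its mole fraction in the mixture. Under that assumption, mole fractions inversely proportional to the rate constants give equal incorporation. The reactivity values below are invented for illustration.

```python
def compensating_fractions(relative_reactivity):
    """Mole fractions inversely proportional to relative reactivity,
    normalized to sum to 1 (simple competitive-coupling assumption)."""
    inverse = {name: 1.0 / k for name, k in relative_reactivity.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def incorporation(relative_reactivity, fractions):
    """Predicted share of each building block in the product mixture,
    assuming rate = reactivity * mole fraction."""
    rates = {name: relative_reactivity[name] * fractions[name]
             for name in fractions}
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


if __name__ == "__main__":
    # Hypothetical relative coupling reactivities (not measured values).
    reactivity = {"Gly": 2.0, "Ala": 1.0, "Val": 0.5}
    equimolar = {name: 1 / 3 for name in reactivity}
    print(incorporation(reactivity, equimolar))                 # biased mixture
    adjusted = compensating_fractions(reactivity)
    print(adjusted)
    print(incorporation(reactivity, adjusted))                  # equal shares
```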

Automation of Parallel and Combinatorial 
Chemical Synthesis Methodologies 

The explosion of the combinatorial chemistry field, 
especially using peptide chemistry, has prompted 
several research groups in academia and industry to 
design robots for multiple parallel or split synthesis of 
large libraries of peptides, peptide mimetics and other 
non-oligomeric small organic molecules 
[1o,17b,34a,37]. Several automated synthesizers are 
presently commercially available from Bohdan 
Automation Inc., Advanced ChemTech, Tecan U.S. 
Inc., Zymark Corp. and Argonaut Technologies Inc. 
Many other automatic synthesizers are presently being 
developed for in-house use by several combinatorial 
chemistry companies and may, in the near future, 
become available on the market. 



Components of the Library 

As a result of the successful developments in solid 
phase peptide [8] and DNA [38] synthesis, most of the 
early reports on combinatorial chemistry dealt with 
molecules having peptidic or nucleic acid backbones. 
The advantages of this approach are the high average 
stepwise yield, the absence of any purification 
procedures, and the possibility of automation and 
robotization. Three components are involved in the 
preparation of a chemical library, i) the carrier or the 
support on which the molecules are built, ii) the 
building blocks and the chemistry associated with 
them, and iii) the linkers used to attach the building 
blocks to the support. 

The Support 

Soluble and insoluble polymeric supports 

The support must be mechanically and chemically 
stable to the solvents and reagents involved in the 
synthesis. It should be compatible with the milieu 
where the assay is performed in case the compounds 
are left on the resin. In this case, the support should be 
chosen in order to minimize non-specific interactions 
and optimize the desired interaction with the tethered 
compound. Most of the combinatorial libraries reported 
were prepared on polystyrene cross-linked with 
divinylbenzene (1-2%) or on a copolymer of polystyrene 
and polyethyleneglycol (PS-PEG). The first has the 
advantage of having higher loading levels, being 
mechanically more stable, and being the least 
expensive on the market. PS-PEG copolymer has 
much better swelling properties in polar solvents, its 
hydrophilicity is compatible with macromolecular 
receptors used in biological assays, and it has better 
chemical reactivity as a result of not being cross-linked. 
The first synthetic library was built on polyacrylate-grafted 
polyethylene rods (pins) [16], on which 
activated amino acid monomers were oligomerized 
using standard solid phase peptide synthesis. The pins 
were organized in a 12 × 8 format in order to match the 
wells of a 96-well ELISA microtiter plate. Functionalized 
silica chips were also used in combination with 
photolithography technology to construct large arrays of 
peptides and oligonucleotides on microscopic 
locations (50 µm × 50 µm). Using laser confocal 
fluorescence techniques, specific interactions with 
fluorescently labelled molecules could be detected 
and identified according to their location (vide infra) 
[18,26]. Cellulose paper discs [39], glass, membranes 
[39a, 40], cotton fragments [40b, 41], polystyrene-grafted 
polyethylene film matrices [42], and 
polydimethylacrylamide [12,43] based resins were 
used successfully in numerous studies. PEGA (bis 
2-acrylamidoprop-1-yl polyethyleneglycol crosslinked 
dimethylacrylamide) is a newly introduced hydrophilic, 
bio-compatible matrix that swells highly in a wide 
range of solvents, and can be used as an alternative to 
polystyrene based matrices [44]. A new class of PS-PEG 
based matrices was designed so that the interior is 
differentiated from the exterior of the bead. This 
technique allows the introduction of two distinct 
molecules on the same bead [45]. 

The number of beads per gram of resin limits the 
size of a library when using a split synthesis scheme. 
For instance, a 90 µm diameter PS-PEG resin (Rapp 
Polymere, Millipore) contains ~10^6 beads/g, which 
means that with one gram of resin a library of more than 
10^6 members cannot be envisioned. In practice, in 
order to make sure that each member is represented, 
several copies of a bead carrying the same ligand are 
necessary. This decreases the theoretical number of 
diverse molecules that can be synthesized on 1 g of 
resin by at least one order of magnitude depending on 
the size of the library and the type of assay. Moreover, a 
10^12 member library would involve 1000 kg of resin! 
Ways around this limitation involve the use of smaller 
beads (10 µm diameter [46]), or the one bead-one motif 
approach described earlier. One bead contains ~100 pmol of 
peptide, which is usually sufficient for subsequent 
biological assay and analysis. 
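The resin requirement discussed above reduces to simple arithmetic. The sketch below uses the quoted figure of about 10^6 beads per gram and treats the number of redundant copies per library member as a free parameter; the value of 10 copies in the example is an illustrative assumption, and the function names are hypothetical.

```python
BEADS_PER_GRAM = 10 ** 6   # approximate figure quoted for 90 µm PS-PEG beads


def resin_mass_g(members, copies=1, beads_per_gram=BEADS_PER_GRAM):
    """Grams of resin needed to carry `copies` beads of each library member."""
    return members * copies / beads_per_gram


def max_distinct_members(resin_g, copies=1, beads_per_gram=BEADS_PER_GRAM):
    """Largest library that fits on resin_g grams with the given redundancy."""
    return int(resin_g * beads_per_gram) // copies


if __name__ == "__main__":
    print(max_distinct_members(1))              # upper bound for 1 g of resin
    print(max_distinct_members(1, copies=10))   # with 10 copies per member
    print(resin_mass_g(10 ** 12) / 1000, "kg")  # 10^12 member library
```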

Soluble polymeric supports such as 
polyethyleneglycol (PEG) were introduced recently in 
combinatorial chemistry [47]. They have been used 
since the mid 1960s as an alternative to insoluble 
polymeric supports [48] for peptide [49], 
oligonucleotide [50], and carbohydrate synthesis [51]. 
They present the major advantage of possessing 
solution phase reactivity and a high propensity for 
crystallisation that allows the polymer bound 
compounds to be purified by a simple precipitation-filtration 
process. The polymer bound compound can 
also be analyzed by NMR. 

Without support 

High yield reactions in solution are amenable to 
combinatorialization and automation [52] since in this 
case purification is not required. Polymer bound 
reagents were successfully applied to one pot 
multistep syntheses [53]. The Ugi reaction may be 
considered the prototype for non-supported solution 
phase combinatorial synthesis. It involves a one pot 
synthesis of amino acid derivatives from an isocyanide, 
aldehyde, amine, and a carboxylic acid [1q,54]. 



Libraries of esters, amides [55], carbamates [56], 
amides on a template (scaffold) [57], and di- and 
trisaccharides [58,83a] were prepared as 
mixtures in solution and were screened for biological 
activity using a pooling strategy (vide infra). 

The Linkers 



The chemical nature and length of the spacer 
between the matrix and the ligand is of the highest 
importance in solid phase synthesis. Like the solid 
support, it has to be chemically inert and compatible 
with the synthetic scheme. It can dramatically influence 
the binding and accessibility of the immobilized ligand 
to a soluble macromolecular receptor that may be used 
in subsequent biological assays. A universal linker 
does not exist since it is the synthetic sequence that 
determines which is appropriate. The linker may be 
considered to be a permanent protecting group or a 
protecting group that is introduced at the beginning of 
the synthesis and removed at the end without affecting 
the final product. Many spacers are commercially 
available but in most cases they were designed for 
peptide and oligonucleotide synthesis [1r,59]. Several 
versatile spacers were recently reported in the literature 
which may be more compatible with other chemistries. 
Some of them are grouped in Fig. (7) and Table (4). 

Table 4. Some of the Linkers Commonly Used in Solid Phase Synthesis (Fig. (7)). This Table Lists the 
Cleavage Conditions for Each Linker and the Functional Group That Can Be Attached. 

Linker                              Cleavage Conditions and Use                               Ref. 
1- Wang linker, R = H               95% TFA. For carboxylic acids                             [60] 
2- SASRIN linker, R = OMe           1% TFA. For carboxylic acids                              [60] 
3- Rink acid linker, R = OH         AcOH/DCM. For carboxylic acids                            [61] 
4- Rink amide linker, R = NH2       TFA/DCM. For carboxylic acids                             [61] 
5- Ellman linker                    95% TFA/DCM. For alcohols                                 [62] 
6- Silicon based linker 1           TBAF. For carboxylic acids                                [63] 
7- Silicon based linker 2           TBAF. For carboxylic acids                                [64] 
8- Oxime linker                     Hydrazine. For carboxylic acids                           [65] 
9- Chlorotrityl linker, R = H       AcOH. For nucleophiles                                    [66] 
10- 2-Chlorotrityl linker, R = Cl   AcOH/DCM 25%. For nucleophiles                            [66] 
11- BHA linker                      CF3SO3H. For carboxylic acids                             [67] 
12- PAM linker                      HF, CF3SO3H, TBAF. For carboxylic acids.                  [60b] 
                                    100 times more stable than Merrifield linker 
13- Base labile linker 1            DBU/Piperidine. For carboxylic acids                      [68] 
14- Base labile linker 2            NaOH. For alcohols and amines                             [69] 
15- Base labile linker 3            NaOH. For carboxylic acids                                [70] 
16- HYCRAM linker                   Pd(0)/H2. For carboxylic acids                            [71] 
17- Sieber amide linker             TFA/DCM 1%. For carboxylic acids                          [72] 
18- Safety catch linker             CH2N2 followed by a base or nucleophile, or alkylation    [1q,73] 
                                    with ICH2CN followed by a nucleophile. For carboxylic acids 
19- SCAL linker                     (EtO)2PS2H/TFA. For carboxylic acids                      [74] 
20- Photolabile linker 1            Irradiation at 000 nm. For carboxylic acids               [75] 
21- Photolabile linker 2            Irradiation at 350 nm. For carboxylic acids               [75] 
22- Photolabile linker 3            Irradiation at 350 nm. For carboxylic acids               [76] 
23- Photolabile linker 4            Irradiation at 365 nm. For carboxylic acids               [77] 

Fig. (7). Some of the linkers used in solid phase synthesis. 

Photocleavable linkers were introduced because 
they allow mild and controlled liberation of the 
compounds for biological or analytical studies. 
Heteromultifunctional spacers that allow partial release 
of a ligand from the bead for solution screening [78] or 
reversal of the orientation of a peptide on the resin 
bead so that the C-terminus is displayed have also 
been developed [21, 78a, 79]. Silyl linkages for 
attaching aromatic compounds that do not leave any 
trace of the linker in the synthesized compounds after 
protodesilylation (HF cleavage) were also reported 
[80]. 

Building Blocks 

Amino acids, nucleotides and carbohydrates 

Biopolymers such as peptides [1o,81] and 
oligonucleotides [82] were chosen in combinatorial 
chemistry for their synthetic accessibility and their 
demonstrated pharmacological properties. 
Furthermore, highly sensitive bioanalytical methods 
such as Edman degradation, Maxam and Gilbert 
microsequencing, mass spectrometry, and PCR 
are available for the detection and identification of the 
active molecule. Unfortunately, these compounds 
have poor oral absorption and poor metabolic 
stability. Nevertheless, peptide libraries may 
provide structure-activity relationships on which to base 
subsequent peptide mimetic library design. 
Carbohydrates and glycopeptides are the only classes 
of natural oligomers that have not seen wide application 
in combinatorial chemistry. This is due to their 
complexity and low synthetic accessibility, which 
make them difficult targets for automated synthesis. 
Nevertheless, research efforts are ongoing in several 
laboratories, and the first successful carbohydrate 
combinatorial libraries have recently been reported 
[51a,83]. 

Fig. (8). Some natural and unnatural oligomers used in the generation of chemical libraries. 

Unnatural oligomers 

The development of alternative backbones to the 
naturally occurring ones may be considered the 
second major step in library design. After the 
establishment of combinatorial chemistry as a viable 
approach to drug discovery using biopolymers 
(peptides, oligonucleotides), researchers started 
addressing the question of bioavailability, chemical 
stability and synthetic accessibility of their libraries. A 
very successful approach to this problem was the 
design of peptide and DNA mimetics that combine both 
chemical stability and bioavailability with the inherent 
synthetic accessibility of oligomeric molecules allowing 
use of automated synthesizers. 

Peptides with unusual amino acids, modified 
peptide bonds, linker or spacer molecules, and peptide 
mimetics [81b,84] such as biopolymers with repeating 
urea units [85], oligocarbamates [86], oligosulfoxides, 
oligosulfones [87], vinylogous polypeptides [88], 
peptoids [89], vinylogous sulfonyl peptides [90], 
pyrrolin-4-one peptidomimetics [91], hydrazinopeptides 
[92], azole peptide mimetics [93], and many others 
were reported. Some of these unnatural oligomers are 
shown in Fig. (8). Peptoid libraries yielded the first 
potent unnatural oligomeric ligands for pharmaceutically 
relevant receptors (7-transmembrane G-protein-coupled 
receptors), binding in the nanomolar range [94]. 

Small molecules prepared on a scaffold/template 

While most of the early efforts in the construction of 
chemical libraries were directed towards oligomeric 
natural molecules based on peptides and nucleic acids, 
the usual drug candidate is a compact multifunctional 
small molecule with a molecular weight between 200 
and 600 [3a]. The translation of an oligomeric lead into 
a non-oligomeric small molecule drug candidate remains 
an extremely difficult task. The solution to this problem, 
proposed and tested by several laboratories, consists of 
the design of libraries of small non-oligomeric 
molecules. Privileged structures [95], possessing a 
generic scaffold found in a number of potent therapeutic 
agents, were often used in this design. Since the 
chemistry associated with this type of library differs 
from that developed for oligomeric molecules, research 
efforts are now being directed towards the exploration 
of several aspects of synthetic organic chemistry on 
solid supports. The challenge in the design of 
combinatorial libraries of small molecules lies in the 
adaptation of solution phase synthetic organic chemistry 
to synthesis on polymeric supports and in the screening 
and identification of the desired compound. For small 
molecule libraries, the chemistry has to be optimized for 
a few individual representative molecules before the 
construction of the library. Unlike biopolymer chemistry, 
each time the template structure is changed, a new 
synthetic methodology has to be developed. Moreover, 
the synthesis needs to be short because each step has to 
be quantitative and generalizable to different starting 
building blocks. Therefore, this method might be best 
exploited for the optimization of a structurally similar 
class of compounds at an advanced stage in the drug 
discovery process rather than for the identification of 
novel leads. The screening issue was solved using 
direct and indirect techniques that will be discussed in 
the next section. 

A fine balance between ligand rigidity and flexibility, 
a high density of functional groups, and exhaustive 
coverage of the conformational space and of the 
universe of diversity (shapes, functional group 
distribution and electrostatic surfaces) are the criteria 
for the design of small molecule libraries. For this 
reason the following scaffolds/templates with various 
topologies have been studied (Fig. (9)): Kemp's triacids 
[96], thiazolidines [97], all-cis cyclopentanes [98], 
xanthenes, cubanes, benzene triacids [57a-57d], 
1,4-benzodiazepin-2-ones, hydantoins [17a,19a,99], 
1,4-benzodiazepin-2,5-diones [100], diketopiperazines 
[1s,101], isoquinolinones [102], 1,4-dihydropyridines 
[103], dihydro and tetrahydro isoquinolines [104], 
pyrrolidines [105], thiazolidine-4-carboxylic acids [97], 
thiazolidinones [106], imidazoles [1q], and β-turn 
templates [107]. Libraries incorporating isosteres that 
mimic the tetrahedral intermediate of peptide 
hydrolysis may also be added to this category since 
they are built around a pharmacophore or a specific 
framework. Examples include statine [108], 
hydroxyethylamine [109], hydroxyethylurea [62,110], 
diamino diol core, diamino alcohol core [111], and 
peptidyl phosphonate [112,113]. Many other small 
molecule libraries, as well as new synthetic 
methodologies on polymeric supports associated with 
them, were reported [1q,1r,31,73b,114]. 

Finally, it is noteworthy that although the synthesis of 
small molecules on solid supports goes back to the 
early 70's [59b,115], it did not attract strong 
enthusiasm until the advent of combinatorial chemistry 
in the late 80's. 



Screening Methodologies: Identification of the Target Molecule 

This section describes the general methodologies 
developed in order to identify one or more active 
molecules from a large library of compounds without 
having to individually screen every member of the 
library. The actual assays (binding, catalytic, inhibition, 
competition, whole cell, chromogenic, colorimetric, 
physical, etc.) that demonstrate the activity of a given 
compound are numerous and sometimes artistically 
designed. These techniques will not be discussed here. 

The appropriate screening approach is the key 
element in the drug discovery process and its success 
relies entirely on the sensitivity and specificity of the 
assay and screening technique. The screening 
process is very much a matter of chance, but it is less 
stringent than a pure mathematical model in that the 
most potent candidate need not be discovered in the 
first screen. A lower affinity ligand may be selected and 
used as a lead in a second round of selection from a 
smaller targeted library. The screening can be 
performed on tethered or soluble libraries and on 
physically segregated ligands or mixed pools. This 
defines the assay method to be used and ultimately 
determines the size of the libraries to be screened. The 
compound should be present in solution or on the 
matrix in amounts depending on the intended affinity. 
High affinity receptors require very small concentrations 
of the ligand (nM range). The large size of peptide and 
oligonucleotide libraries that can be readily obtained (up 
to 10^15 compounds) does not allow screening of 
individual compounds and for this reason, pooling 
strategies were developed in which sublibraries are 
screened for the desired activity. For the screening of 
mixtures of molecules, there is an optimal library size. 
Large mixtures (>10^6) are less successful than small 
ones (<10^3) because screening small compound 
libraries or sublibraries presents several advantages: 
i) each member is present in sufficient amounts; 
ii) artifacts resulting from non-specific interactions with 
low affinity ligands are minimized; iii) the quality of the 
library is better assessed; iv) the biological assay can be 
analyzed quantitatively; v) active components are more 
easily identified. 

Fig. (9). Scaffolds/templates used for the generation of small molecule libraries. 
Non-Coded Methodologies 

Several methods for identifying an active 
component in a soluble or support bound mixture of 
compounds are based on a deconvolutive approach. 
Libraries are divided into sublibraries, which are 
themselves divided into other sublibraries; the library 
can then be screened for activity from the bottom or the 
top of the tree. 

Positional Libraries [2,34c, 116] 

Dual defined iterative scanning 

For instance, a library containing all possible 
hexapeptides from a set of 20 amino acids (20^6 = 64 x 
10^6) was divided into 400 hexapeptide sublibraries 
in which the first (or last) two amino acids are fixed as 
one of the 400 possible dipeptides from the same set of 
20 amino acids (20^2). The remaining building blocks 
are introduced randomly using split synthesis or a 
mixture of amino acids [29b,30d,30e]. Thus, each 
sublibrary contains 160000 (20^4) hexapeptide members 
in which the first (or last) two positions are known. Each 
of the 400 mixtures is tested for activity and the most 
potent is selected for a second round of sublibrary 
synthesis and screening. This time the first and second 
positions are fixed, the third is assigned one of the 20 
possible amino acids, and the last three positions are 
randomized. Thus, 20 sublibraries containing 8000 
hexapeptides (20^3) each are constructed. The 
procedure is repeated in an iterative fashion until 
the most active member is identified. Fig. (10) 
illustrates this method for a tetrapeptide library using a 
set of 3 amino acids and following a split synthesis 
scheme in combination with the Tea Bag method. In 
the case of a hexapeptide library, for instance, the initial 
screen would involve sublibraries of 160000 peptides 
each, which may require a highly sensitive and specific 
assay. This limits the methodology, especially 
when libraries of longer peptides using larger sets of 
building blocks are to be screened. The advantage of 
this approach is that the method is not limited to natural 
amino acids but can be applied to other building blocks. 
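
The bookkeeping of the rounds described above can be checked with a short sketch. The function name and its parameters are illustrative choices, not part of the original method description; the sketch only counts mixtures and their sizes and does not model any assay.

```python
def iterative_scanning_rounds(n_blocks=20, length=6, first_fixed=2):
    """List (positions_fixed, mixtures_screened, peptides_per_mixture) per round.

    Assumes the scheme described in the text: the first round defines
    `first_fixed` positions at once, every later round defines one more.
    """
    rounds = []
    fixed = first_fixed
    # Round 1: one mixture per possible combination of the defined positions.
    rounds.append((fixed, n_blocks ** first_fixed, n_blocks ** (length - fixed)))
    # Later rounds: one additional position is defined, so n_blocks mixtures.
    while fixed < length:
        fixed += 1
        rounds.append((fixed, n_blocks, n_blocks ** (length - fixed)))
    return rounds


rounds = iterative_scanning_rounds()
for fixed, mixtures, size in rounds:
    print(f"{fixed} positions defined: {mixtures} mixtures of {size} peptides")

total_mixtures = sum(mixtures for _, mixtures, _ in rounds)
print("mixtures synthesized and assayed in total:", total_mixtures)
```

Under these assumptions the full deconvolution of the 20^6 library needs a few hundred mixtures rather than 64 x 10^6 individual assays, at the price of resynthesis in every round.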

Although this method can conveniently produce 
soluble peptides in sufficient quantities for several 
available assays, it requires considerable synthetic 
effort and time compared to the pin methodology 
described in the preceding section. On the other hand, 
the pin methodology, which produces immobilized 
peptide libraries on solid support, is limited by 
accessibility problems into the matrix, 
microenvironment effects, ligand orientation and 
conformation on the matrix, and assay procedure. 

Positional scanning 

This methodology consists of preparing as many 
libraries as there are positions in a targeted sequence 
(e.g. 6 for a hexapeptide). Each one of these libraries is 
divided into as many sublibraries as there are building 
blocks in the set of amino acids used for 
combinatorialization. For instance, in order to identify 
the most potent hexapeptide in a library of all possible 
hexapeptides prepared from a set of 20 amino acids 
(20^6 = 64,000,000 possible combinations), 6 different 
libraries will be designed. Each library is divided into 20 
sublibraries where one position is fixed each time to 
one building block of the set and the other positions 
are randomized. Overall, 120 (6 x 20) sublibraries, 
each containing 3.2 x 10^6 (20^5) compounds, will be 
synthesized (Fig. (11)). The screening of the 120 
peptide mixtures for a given activity will identify the best 
amino acid at each position. The combination of these 
amino acids into one sequence should generate the 
most potent component of the library. Unlike the dual 
defined iterative scanning methodology, this approach 
does not benefit from the progressive improvement of 
the signal-to-noise ratio obtained through the iterative 
synthesis of smaller libraries. While this approach can 
rapidly identify the active component in a library, it 
does so only if each position in the sequence is 
independent of the other positions [34b]. 

Fig. (10). Dual defined iterative scanning of a tetrapeptide library prepared from a set of 3 amino acid building blocks. The 
first two positions were introduced using split synthesis, and the following positions were introduced using multiple parallel 
peptide synthesis with the Tea Bag technique. 
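
The independence requirement of positional scanning can be illustrated with a toy model. Everything in the sketch below is hypothetical: the 4 building blocks, the tripeptide length, the contribution table and the purely additive activity function are assumptions chosen so that positions are independent by construction, and the signal of a mixture is taken to be the mean activity of its members.

```python
from itertools import product

BLOCKS = "ABCD"   # hypothetical building block set
LENGTH = 3        # hypothetical sequence length

# Hypothetical contribution of each building block at each position.
CONTRIB = [
    {"A": 0.1, "B": 0.9, "C": 0.3, "D": 0.2},
    {"A": 0.4, "B": 0.2, "C": 0.8, "D": 0.1},
    {"A": 0.7, "B": 0.3, "C": 0.2, "D": 0.5},
]


def activity(seq):
    """Additive activity: positions do not influence one another."""
    return sum(CONTRIB[i][block] for i, block in enumerate(seq))


def sublibrary_signal(position, block):
    """Mean activity of the mixture in which `position` is fixed to `block`."""
    members = [s for s in product(BLOCKS, repeat=LENGTH) if s[position] == block]
    return sum(activity(s) for s in members) / len(members)


def positional_scan():
    """Pick the best block at each position from LENGTH x len(BLOCKS) mixtures."""
    return "".join(
        max(BLOCKS, key=lambda b: sublibrary_signal(p, b)) for p in range(LENGTH)
    )


best_by_scan = positional_scan()
best_by_enumeration = "".join(max(product(BLOCKS, repeat=LENGTH), key=activity))
print(best_by_scan, best_by_enumeration)
```

With an additive activity the scan and the exhaustive enumeration agree. If the activity function contained cooperative terms between positions, the two answers could differ, which is the limitation noted in the text.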



These scanning strategies were used in 
combination with the Tea Bag method and the split 
synthesis, in libraries of up to 10^12 members, for the 
discovery of chymotrypsin [117], trypsin [118], HIV 
integrase [119], and HIV protease [108] inhibitors, 
antimicrobial peptides [120], opioid compounds 
[34d,123], and other drugs [36,125], as well as for 
epitope mapping [14,15,32a,121], protein 
conformational mapping [122], and structure-activity 
relationship studies [124]. The same approach was 
successfully applied to oligonucleotide libraries [126]. 

One Bead-One Peptide Technology 

This approach is based on the split synthesis 
described in the previous section. The peptides are left 
on the support and are screened with a soluble 
macromolecular receptor bearing a reporter molecule 






360 Current Medicinal Chemistry, 1996, Vol. 3, No. 5 



Hicham Fenniri 



A1-X2-X3-X4-X5-X6 
X1-A2-X3-X4-X5-X6 
X1-X2-A3-X4-X5-X6 
X1-X2-X3-A4-X5-X6 
X1-X2-X3-X4-A5-X6 
X1-X2-X3-X4-X5-A6 

A1.1-X2-X3-X4-X5-X6 
A1.2-X2-X3-X4-X5-X6 
A1.3-X2-X3-X4-X5-X6 
... 
A1.20-X2-X3-X4-X5-X6      (20 Sublibraries) 

6 x 20 Sublibraries 

Fig. (11). Positional scanning. To identify the active 
component of a hexapeptide library, 6 libraries are prepared 
where one position is fixed. Each one of these libraries is divided 
into as many sublibraries as there are building blocks allowing 
thus every possible building block in every position an 
opportunity to identify itself during the screening of the 120 
sublibraries as an active building block at a given position. 

(fluorescent, radioactive, or another type of tag) that 
allows the identification of the bead that carries the 
desired compound [127]. The bead is isolated and 
submitted to mass spectrometry analysis, Edman 
microsequencing, nuclear magnetic resonance (NMR), 
infrared (IR) spectroscopy, or other analytical methods 
that allow the direct identification of the compound 
linked to the resin (vide infra). The compound can 
also be partially released from the resin for assay in 
solution before determination of its structure [128]. 
This approach was successfully applied in a wide range 
of studies, including epitope mapping, anticancer drug 
discovery [129], and the discovery of new ligands for 
protein G, MHC class I molecules, the platelet-derived 
gpIIb/IIIa receptor, the SH3 domain of 
phosphatidylinositol 3-kinase, Src-family protein 
tyrosine kinases [130], thrombin, factor Xa, cytokine 
receptors, streptavidin, avidin, and artificial receptors 
[128b]. 

Recursive Deconvolution [131] 

This methodology differs from the standard split 
synthesis in that after each coupling step and 
before mixing, a portion of the resin is saved from each 
sublibrary, as shown in Fig. (12). At the end of the split 
synthesis, each pool containing a subgroup of the 
library is tested separately in an appropriate assay. The 
pool that gives the best response identifies the best 
building block at the last position introduced in the split 
synthesis. For instance, in the example of Fig. (12), 
where pool II gives the highest response, B is identified 
as the best building block at the third position (third 
step of the synthesis). The saved portions P2[A], P2[B] 
and P2[C] are then coupled separately with B, leading 
to three sublibraries in which the second and third 
positions are known. These sublibraries are submitted 
to the same assay as the first ones, and the sublibrary 
that gives the best response identifies the best building 
block at the second position. The whole procedure is 
repeated once more in order to identify the best first 
position. This approach presents several analogies with 
the dual defined iterative scanning strategy described 
above. 

[Fig. (12) scheme. Steps: divide; couple and save; mix and divide; couple and save; mix and divide; couple and test. 
One third of the resin is saved after each of the first two couplings (portions P1[A], P1[B], P1[C] and P2[A], P2[B], 
P2[C]). Pools I, II and III contain 9 compounds each; pool II is the most active pool.] 

Fig. (12). Recursive deconvolution strategy. The principle is the same as the split synthesis, except that after each coupling 
[...] saved after coupling with X at the ith step. 
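
The recursion can be sketched for the 3 x 3 x 3 case of Fig. (12). The activity function below is a hypothetical stand-in for the assay (it counts positions that match an assumed best sequence C-A-B), and the signal of a pool is taken as the mean activity of its members; neither assumption comes from the original work.

```python
from itertools import product

BLOCKS = ("A", "B", "C")
TARGET = ("C", "A", "B")   # hypothetical most active trimer


def activity(seq):
    """Hypothetical assay: number of positions matching TARGET."""
    return sum(1 for got, want in zip(seq, TARGET) if got == want)


def pool_signal(pool):
    """Assumed read-out of a mixture: mean activity of its members."""
    return sum(activity(s) for s in pool) / len(pool)


def recursive_deconvolution(n_steps=3):
    known_tail = ()   # best blocks already identified, last positions first
    for step in range(n_steps, 0, -1):
        pools = {}
        for x in BLOCKS:
            # Saved portion P_step[X]: every sequence of length `step`
            # whose last coupled block is X.
            saved = [prefix + (x,) for prefix in product(BLOCKS, repeat=step - 1)]
            # Couple the saved portion with the blocks identified so far.
            pools[x] = [s + known_tail for s in saved]
        best = max(BLOCKS, key=lambda x: pool_signal(pools[x]))
        known_tail = (best,) + known_tail
    return "".join(known_tail)


best_sequence = recursive_deconvolution()
print(best_sequence)
```

In this toy case three rounds of three assays identify one compound out of 27; the first round corresponds to pools I, II and III of Fig. (12), with the pool ending in B giving the best response.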

Encoded Methodologies [1n,132] 

Encoded combinatorial chemistry emerged as a 
result of a reorientation of research efforts towards 
combinatorial libraries of small molecules of 
non-peptidic or non-nucleic acid nature, which cannot 
be identified using the standard procedures (Edman 
degradation, Sanger dideoxy sequencing, mass 
spectrometry, NMR, etc.). The reasons for this shift 
have been discussed in the previous section. Moreover, 
the standard techniques require a certain amount of 
material (> 1 pmol) that is not always available in large 
libraries. Therefore, highly sensitive indirect methods 
allowing the identification of an active component even 
in small quantity are crucial for the development of 
combinatorial chemistry. 

DNA Encoded Libraries 

The principle of genetically coded libraries was 
proposed [133] and demonstrated [134]. The 
synthetic scheme is based on the split synthesis 
method. A heterotrifunctional linker is attached to a 
support and the two remaining functions are used to 
synthesize a peptide and an encoding oligonucleotide 
sequence. For each amino acid that was introduced, a 
three letter genetic code is orthogonally attached that 
identifies the amino acid, leading thus to specific 
encoding DNA for each peptide on each bead. The 
encoding sequence is flanked on each side with 
primers that allow PCR amplification and identification of 
the active ligands. The advantage of this method is that 
it can be applied to peptidic and non-peptidic ligands, 
as long as the synthetic methodology involved in their 
preparation is compatible with DNA chemistry (Fig. 
(13)). 

Another approach using oligonucleotides as tags 
was introduced. It differs from the previously described 
oligonucleotide encoded library in that the 
synthesis of the ligand and the oligonucleotide is no 
longer orthogonal. Before the introduction of the first 
building block, the matrix is derivatized with two 
heterobifunctional spacers in a ratio favoring a 
maximum incorporation of the one that is used for 
ligand synthesis. The other spacer, present in low 
copy number on the resin, is used to synthesize an 
oligonucleotide that encodes the order and type of 
every chemical step of the combinatorial synthesis. 
After the assay, PCR is performed on the selected 
beads, followed by the sequencing and identification 
of the chemical nature of the active ligand [46]. 
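
The idea of a three-letter code flanked by primer sites can be made concrete with a minimal sketch. The codon assignments, the primer sequences and the function names below are all hypothetical and chosen only for illustration; they are not the codes used in the cited work.

```python
# Hypothetical three-letter codes for a small set of amino acids.
CODE = {"G": "CAC", "A": "ACG", "L": "GTA", "F": "TGC"}
DECODE = {codon: aa for aa, codon in CODE.items()}

# Hypothetical flanking primer sites used for PCR amplification.
PRIMER_5 = "GGATCC"
PRIMER_3 = "GAATTC"


def encode(peptide):
    """Build the tag that would accompany `peptide` on its bead."""
    return PRIMER_5 + "".join(CODE[aa] for aa in peptide) + PRIMER_3


def decode(tag):
    """Read a sequenced tag back into the peptide it encodes."""
    if not (tag.startswith(PRIMER_5) and tag.endswith(PRIMER_3)):
        raise ValueError("tag is not flanked by the expected primer sites")
    core = tag[len(PRIMER_5):len(tag) - len(PRIMER_3)]
    if len(core) % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return "".join(DECODE[core[i:i + 3]] for i in range(0, len(core), 3))


tag = encode("GALF")
print(tag, "->", decode(tag))
```

Because the tag is read after amplification, only the tag (and not the ligand itself) needs to be present in detectable amounts, which is the point made in the text about non-peptidic ligands.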

A method that uses peptides as molecular tags for 
unnatural ligands and Edman degradation for the 
read-out of the coded information was also reported 
[19b,20,135]. 

Fig. (14). Polyhalobenzene encoded library. General scheme for a permutational split synthesis using a set of 3 building 
blocks (A, B, C) and their tags (T1, T2, T3, T4). The solid spheres represent the matrix to which the building blocks (squares) 
and the tags (T1-T4) are introduced. The structures of the tags are shown at the bottom of the figure. After photocleavage, an 
[...] characteristic gas chromatogram profile. The 4 digits binary code numbers next to each bead are a record of the chemical 
history of each bead which can be read directly on the chromatogram. 

Polyhalobenzene Encoded Libraries 

A similar approach to the DNA encoded 
combinatorial approach known as the "Bar Code 
methodology" makes use of small, chemically inert 
molecular tags that are attached to the resin in a non- 
sequential fashion. This approach is more 
advantageous in that it does not involve orthogonal 
and consecutive attachment of the tags which 
simplifies the synthetic effort, and is compatible with a 
wide range of chemical reactions. 

This tagging approach [114d,136] uses a binary 
code (0,1) that states if the tag molecule is present (1) 
or absent (0). With 40 such tags, for instance, one can 
uniquely encode 2^40 ≈ 10^12 chemical events. 
The tag encodes both the step number and the 
chemical reagent used in that step so that the array of 
tags used forms a binary record of the synthetic steps 
for each bead. Each one of the digits corresponds in 
practice to a peak on the chromatogram with a defined 
retention time. n tag molecules can be used to encode 
(2^n - 1) starting compounds per reaction step. In an 
N-step reaction sequence, combinatorial principles 
indicate that (2^n - 1)^N compounds can be encoded with 
(n x N) tags. For instance, a directional permutational 
library that is constructed following a split synthesis 
scheme with a set of 3 building blocks (X1, X2, X3) and 
2 chemical steps to generate compounds of the 
generic formula A1A2, where A1 and A2 can be X1, X2, 
or X3, will involve 6 (2 x 3) chemical steps and will host 
9 (3^2) members. In this case 2 tag molecules are used 
to encode 3 (2^2 - 1) starting compounds per reaction 
step. Hence, a 2-step reaction sequence to generate 
all 9 ((2^2 - 1)^2 = 3^2) possible dimers A1A2 will be 
encoded by 4 molecular tags (2 x 2). Fig. (14) 
illustrates this methodology. 

In practice, if for instance beads 1 and 9 carry a 
biologically active molecule, they are selected, the tags 
photocleaved and analyzed by ECGC (electron capture 
gas chromatography). The chromatogram would show 
the profiles in Fig. (15). Bead 1 would be assigned the 
bar code (0101) indicating that tags T1 and T3 are 
present on the bead, and bead 9 would be assigned 
the bar code (1111) indicating that tags T1, T2, T3 and 
T4 are present on the bead. One drawback of this 
method is that the assay is conducted on the support 
bound compounds which also carry a set of tags. Non 
specific interaction and microenvironment effects 
affecting both the ligand and the receptor conformation 
may have an influence on the recognition process. A 
very similar approach that uses amino acid derivatives 
as the tag molecules and HPLC (high performance 
liquid chromatography) for the read-out of the code was 
also reported [128b]. 
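
The binary bookkeeping described in this section can be sketched in a few lines. The conventions below are assumptions made for illustration: tags T1 and T2 encode the first step and T3 and T4 the second, building blocks A, B and C receive the codes 01, 10 and 11 within a step, the bar code is written from the highest-numbered tag to T1, and bead 1 is taken to carry A-A and bead 9 to carry C-C. These choices reproduce the bar codes quoted in the text but are not confirmed by it.

```python
def encodable_compounds(n_tags_per_step, n_steps):
    """(2^n - 1)^N compounds can be encoded with n x N tags."""
    return (2 ** n_tags_per_step - 1) ** n_steps


def tags_on_bead(blocks_used, building_blocks, n_tags_per_step):
    """Return the set of tag numbers (1-based) attached to a bead."""
    tags = set()
    for step, block in enumerate(blocks_used):
        code = building_blocks.index(block) + 1   # 0 is unused: no tag, no record
        for bit in range(n_tags_per_step):
            if (code >> bit) & 1:
                tags.add(step * n_tags_per_step + bit + 1)
    return tags


def bar_code(tags, total_tags):
    """Write the code from the highest-numbered tag down to T1."""
    return "".join("1" if t in tags else "0" for t in range(total_tags, 0, -1))


def decode_bar_code(code, building_blocks, n_tags_per_step):
    """Recover the building block used at each step from a bar code."""
    bits = code[::-1]   # bits[0] now corresponds to T1
    blocks = []
    for start in range(0, len(bits), n_tags_per_step):
        chunk = bits[start:start + n_tags_per_step]
        value = sum(1 << i for i, c in enumerate(chunk) if c == "1")
        if not 1 <= value <= len(building_blocks):
            raise ValueError("bar code does not correspond to a building block")
        blocks.append(building_blocks[value - 1])
    return blocks


BLOCKS = ["A", "B", "C"]
bead_1 = bar_code(tags_on_bead(["A", "A"], BLOCKS, 2), 4)
bead_9 = bar_code(tags_on_bead(["C", "C"], BLOCKS, 2), 4)
print(bead_1, bead_9, encodable_compounds(2, 2))
```

Reading a chromatogram then amounts to calling `decode_bar_code` on the observed pattern of peaks, for example `decode_bar_code("0101", BLOCKS, 2)`.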

Radiofrequency Combinatorial Chemistry 

This is the most recent development in encoded 
combinatorial chemistry [137]. It involves 
radiofrequency signals and semiconductor memory 
devices using a multifunctional microreactor (Fig. (16)). 
A memory chip capable of receiving, storing and 
emitting radiofrequency signals and a solid support are 
put together in an inert, removable porous capsule. A 
split synthesis scheme using a large pool of these 
microreactors is followed to generate a peptide library. 
The main difference is that in this case, each capsule 
contains one peptidic sequence attached to the resin 
and a memory chip containing all the history of that 
particular sequence (building blocks used, pH, 
temperature, and other parameters for each step of the 
synthesis). When the synthesis is complete, the 
compounds are submitted for biological assay, and the 
structure of the most potent one is read directly from 
the corresponding memory chip. This microreactor will 
soon be commercially available. 

Fig. (16). Schematic representation of a microreactor 
capable of storing the chemical history of its content. 

Fig. (15). Expected gas chromatograms (horizontal axis: 
retention time) after photocleavage of the tags from 
beads 1 and 9 (Fig. (14)). Each peak corresponds to a 
particular tag. The presence of a peak is given the digit 
1 and its absence the digit 0. 



Direct Screening Techniques 

Several methods using mass spectrometry 
[57b,77b,77c,96,138], NMR [139], fluorescence 
correlation spectroscopy (FCS) [140], fluorescence cell 
sorter [46], fluorescence microscopy [141], or IR 
spectroscopy [142] were also developed that allow the 
direct identification of the bound ligand or its encoding 
sequence. 

A recently reported methodology based on TOF- 
MALDI-MS (time of flight matrix assisted laser 
desorption ionization mass spectrometry) is 
summarized in Fig. (17) [138c,138h]. The standard 
split synthesis scheme is followed, except that in this 
case at each coupling step with the building blocks, 
10% of a capping molecule is added in the reaction 
mixture. This way, at each step the full length peptide is 
produced, accompanied by a family of termination 
products that can be characteristic of the building block 
and the corresponding chemical step. Only 2-5% of the 
peptide derived from a single bead is sufficient to 
readily identify the sequence of the peptide when 
using this technique. This method is similar to MALDI 
ladder sequencing methods previously reported [143] 
that use controlled Edman degradation of isolated full 
length peptides, but differs in the manner used to 
generate the sequence specific ladders. 

Fig. (17). TOF-MALDI-MS technique for the identification of 
a peptide after release from a single resin bead. 
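
The read-out of a termination ladder can be sketched as follows: consecutive members of the ladder differ by the mass of one residue, so the sequence follows from the mass differences. The sketch is a simplification under stated assumptions: only a few residues are tabulated (monoisotopic residue masses), the mass of the capping group is folded into an arbitrary base mass, and the function names are illustrative.

```python
# Monoisotopic residue masses (Da) for a small, illustrative set of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "F": 147.06841, "Y": 163.06333,
}


def build_ladder(sequence, base_mass):
    """Masses of the termination products obtained while coupling `sequence`.

    `base_mass` is a hypothetical offset standing for everything that is
    common to all ladder members (for example the capping group).
    """
    masses = [base_mass]
    for aa in sequence:
        masses.append(masses[-1] + RESIDUE_MASS[aa])
    return masses


def read_ladder(masses, tolerance=0.02):
    """Return residues in order of coupling; '?' marks an unassigned step."""
    ordered = sorted(masses)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tolerance]
        residues.append(matches[0] if len(matches) == 1 else "?")
    return "".join(residues)


ladder = build_ladder("GAF", base_mass=100.0)
print(read_ladder(ladder))
```

Residues of identical mass (leucine and isoleucine, for instance) cannot be told apart by mass difference alone, which is one reason a capping reagent characteristic of the building block can be useful.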

A more powerful, more sensitive, and somewhat more 
practical method was also reported; it requires no 
chemical couplings beyond those of the library 
synthesis. It is based on TOF-SIMS (time of flight 
secondary ion mass spectrometry) and analyzes the 
compounds while they are still adsorbed to the resin 
bead [138d]. 

Screening of Oligo(Ribo)Nucleotide Libraries [144] 

Combinatorial libraries of nucleic acid based 
compounds possess the tremendous advantage of 
being amplifiable/enrichable using polymerase chain 
reaction (PCR) technology. In effect, the screening of a 
library of nucleic acid based compounds against an 
immobilized receptor allows for the selection of a small 
set of molecules that interact strongly with the receptor. 
Elution, amplification (PCR), and reselection allow the 
best binders to be identified after a few cycles. This 
strategy was applied to the selection of a novel 
thrombin inhibitor [145], irreversible inhibitors of human 
neutrophil elastase [146], and HIV-1 reverse 
transcriptase [147] and RNase H [148] inhibitors. 
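
Why a few cycles of selection and amplification suffice can be seen from a deliberately crude model. The sketch assumes that each species is retained on the immobilized receptor with a fixed probability and that PCR amplifies all retained species equally; the starting abundance and the retention probabilities are invented numbers used only for illustration.

```python
def selection_round(fractions, retention):
    """One cycle: selection on the receptor, then unbiased amplification."""
    kept = {species: fractions[species] * retention[species] for species in fractions}
    total = sum(kept.values())
    # Amplification restores the pool size without changing proportions.
    return {species: amount / total for species, amount in kept.items()}


# Hypothetical starting pool: one strong binder per million sequences.
pool = {"binder": 1e-6, "background": 1.0 - 1e-6}
# Hypothetical retention probabilities on the immobilized receptor.
retention = {"binder": 0.5, "background": 0.005}

history = [pool["binder"]]
for _ in range(4):
    pool = selection_round(pool, retention)
    history.append(pool["binder"])

for cycle, fraction in enumerate(history):
    print(f"after cycle {cycle}: binder fraction = {fraction:.6f}")
```

In this model each cycle multiplies the binder-to-background ratio by the ratio of the two retention probabilities, so a species that starts as one in a million dominates the pool after four cycles.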



Biologically Generated Libraries 

The biological approaches for the generation of 
molecular diversity are permutational by nature 
because they use one set of molecules (amino acids 
and/or nucleotides). Most of these methods derive 
their proficiency from the amplifying power of phage 
particles and bacteria, or from PCR technology. The 
powerful technique of phage display is based on the 
construction of libraries of peptides or proteins as 
fusion products with proteins expressed on the 
surface of the phage particle [149]. This display 
process allows the selection not only of the peptide or 
protein with the desired biological activity, but also of 
the encoding genetic material that is packaged inside 
the phage particle. The selection step, called panning 
[150], is followed by the amplification of the selected 
particle and the identification and/or isolation of the 
displayed molecule. E. coli [151] and other bacteria 
[152], as well as plasmids [153] and polysomes 
[144d,154], were also used as display systems. The 
latter system displays peptides in a cell free format, 
which increases the size of the library by up to 10^6-fold 
compared to microorganism based display systems. 
Other approaches such as phage or colony-lift 
techniques [155] were also reported. 

The essence of antibody catalysis is the creation of 
a library of molecules from which those with unique 
chemical potential are selected. One of the approaches 
to the generation of these tailored catalysts is to 
challenge the immune system with an antigen that 
resembles the transition state of a given reaction. 
Through the combinatorial association of variable (V), 
joining (J), and diversity (D) genes, the immune system 
generates a tremendous number of antibodies 
(10^6-10^12) against the antigen. Through rapid 
screening and affinity maturation of a small subset of 
this library, the immune system produces highly specific 
antibodies that may catalyze the reaction involving the 
transition state for which they were raised [6]. 

A somewhat similar approach using synthetic 
oligonucleotide libraries led to the selection of RNA 
and DNA molecules exhibiting a very high specificity 
and selectivity for a given substrate. For instance, DNA 
or RNA molecules that bind strongly to adenosine, ATP 
[156], flavin, nicotinamide cofactors [157], amino acids 
[158], proteins [159], aminoglycoside antibiotics [160], 
and other molecules [161] were selected from RNA or 
DNA libraries containing up to 10^15 different 
sequences. 



Oxytetracycline 



Fig. (18). Examples of complex natural molecules obtained 
from the polyketide synthases pathway. 

Combinatorial biosynthesis of unnatural natural 
products is one of the latest developments in the field 
of biologically generated chemical libraries. The 
applicability of this new approach was recently 
demonstrated in the case of polyketides [162]. 
Polyketide synthases (PKSs) are a class of multimeric 
proteins where each subunit is a functional and/or 
catalytic protein. PKSs catalyze repeated 
stereocontrolled condensation cycles between acetyl, 
propionyl, malonyl, or methylmalonyl thioesters. Each 
cycle results in the formation of a β-keto group that may 
undergo further enzyme catalyzed reductive 
transformations and/or cyclizations, leading thus to the 
complex structure of natural products such as 
oxytetracycline, erythromycin and many more (Fig. 
(18)). The potential of this system in combinatorial 
chemistry was demonstrated through the genetic 
construction of PKS libraries that are able to synthesize 
an unlimited number of novel chemical entities with 
predetermined structures. 

Application of Combinatorial Chemistry to Supramolecular Recognition 
and Catalysis 

Supramolecular Recognition 

One of the research areas that will vastly benefit 
from combinatorial chemistry is the field of molecular 
recognition and catalysis. The classical approach to 
host-guest chemistry or molecular recognition 
involved, until a few years ago, the design of a specific 
receptor for a given substrate based upon computer 
modelling, determination of the solid state structure of 
the host and/or the guest, and basic knowledge of 
non-covalent interactions. More recent approaches to 
this problem differ in two fundamental ways. First, 
libraries of compounds instead of one compound are 
used as substrates for a given receptor in order to 
identify the best guest. The receptor has thus become 
the constant and the substrate the variable. From the 
combinatorial chemist's point of view, this approach is 
still the standard approach to drug discovery, where the 
primary goal is to find the lead before questioning how 
it interacts with its receptor. From the supramolecular 
chemist's point of view, an understanding of the 
fundamental rules of molecular recognition using 
simplified synthetic versions of biological receptors is 
still a tremendous challenge. Combinatorial chemistry 
offers an expeditious access to an inexhaustible 
source of information concerning the molecular basis of 
host-guest interactions which may lead, in the near 
future, to the design of synthetic receptors paralleling 
antibodies in terms of complex stability and specificity. 
For example, tweezerlike receptors based on 
vinylogous sulfonyl peptides were screened against a 
library of 50,625 tripeptides in order to validate them as 
potential artificial receptors and to define their eventual 
specificity. This study has shown that, although simply 
designed, these receptors are capable of sequence 
specific recognition of peptides [163]. The specificity 
of other synthetic receptors was determined using the 
same strategy [136b, 164]. The second fundamental 
innovation in molecular recognition, which may have 
more practical applications, is the selection of the best 
host from a library of synthetic receptors for a given 
substrate without any knowledge of their mode of 
interaction. For example, libraries of peptidic receptors 
built upon a steroidal scaffold (Fig. (19)) were prepared 
and successfully screened in order to identify specific 
receptors for [Leu5]enkephalin [165]. Another library of 
synthetic ionophores based upon a peptide 
functionalized cyclen core yielded Cu(II) and Co(II) 
complexes that were more or less stabilized by the 
randomized peptide sequences (Fig. (19)) [166]. The 
self-organization of two functionalized terpyridine 
subunits around a Ru(II) transition metal center was 
exploited to bring together pharmacophores or binding 
regions in close proximity as shown in Fig. (19). This 
class of receptors presents some similarities with 
antibodies, in that it possesses a constant region 
([terpyridine]2Ru complex) and a variable region 
corresponding to the functional groups carried by each 
one of the terpyridines. From a 30 member 
combinatorial library, receptors for dicarboxylates and 
diammonium salts were identified [167]. 

Fig. (19). Synthetic receptor libraries. 

Supramolecular Catalysis 

The next and perhaps most challenging step to 
specific recognition by artificial receptors is the 
transformation of the bound substrate. Tremendous 
efforts have been invested in attempts to reproduce 
the catalytic activity of enzymes. Initial reports 
demonstrating the very modest esterolytic properties 
of cyclodextrins go back to the early 1950's [168]. This 
work was followed by an avalanche of rationally 
designed synthetic receptors based on cyclodextrins 
[169], calixarenes [170], cyclophanes [171], 
polyamines [172], crown ethers [173], and others 
[174]. Some of these receptors exhibited Michaelian 
behaviour and brought insights to fundamental 
questions in enzymology and catalysis in general. 



Unfortunately, organic artificial catalysis did not benefit 
from the same outburst that led organometallic and 
inorganic catalysis to practical applications in industry. 
The rules governing enzyme catalysis, although fairly 
understood, are still extremely difficult to combine in a 
productive way into a single chemically accessible 
synthetic receptor. Combining a few features of natural 
enzymes is an overly simplistic approximation that has 
not yet led to receptors with comparable activities. Here 
again, the combinatorial approach might provide a 
needed boost. Indeed, the first successful examples of 
combinatorially generated artificial catalysts have been 
recently reported and may be a hopeful prelude to this 
particular and promising approach. Polymeric 
polyamine materials, also known for their esterolytic 
properties [175] were randomly functionalized yielding 
polymers which carry several functional groups in 
different ratios (Fig. (20)). In the presence of Fe(III), 
these polymers, with a specific ratio of functional 
groups, remarkably accelerated the hydrolysis of 
activated phosphodiesters (k_cat/k_uncat ~ 10^4-10^5) 
[176]. Another report described a strategy for the 
generation of a library of chiral ligands for 
enantiospecific addition of dialkyl zinc to prochiral 
aldehydes for the generation of chiral secondary 
alcohols, enantioselective reductions of ketones, and 
asymmetric Diels-Alder reactions [115h,115j,177]. 




Fig. (20). Library of polymeric catalysts prepared from 
coupling different ratios of functional groups (residues Ri) to 
a polyamine. 

One of the major limitations that the combinatorial 
chemist must overcome in his/her search for new 
catalysts is the design of an efficient method that allows 
for the rapid screening of libraries of molecules directly 
for catalytic activity. In some cases of nucleic acid based 
libraries, this is overcome by the ability of the active 
molecules to be "evolved" and be selectively amplified 
using PCR technology, even when only a few copies of 
each member of the library are present (vide infra). In 
most of the cases however, the desired molecules are 
selected according to their affinity for the transition 
state of a reaction, for a mimic of a reactive intermediate 
in the reaction pathway, or for a blueprint of the 
presumed active site (e.g. catalytic antibodies [6] and a 
few cases of RNA/DNA based catalysts [161b,178]). A 
new methodology termed an "encoded reaction 
cassette" was recently reported that may offer a general 
solution to the screening problem [179]. The principle 
of this methodology is that it combines both solid 
phase synthesis and PCR technology and operates 
through the liberation (bond cleavage) or capture 
(bond formation) of a polynucleotide containing two 
primers (Fig. (21)) that can be amplified and decoded. 
Thus, when an appropriately functionalized solid 
support is exposed to a catalyst or a library of catalysts 
that are able to selectively cleave the reaction cassette 
at the substrate site, single stranded DNA 
(polynucleotide) will be released and can be amplified 
by the PCR. Furthermore, the sequence of the 
polynucleotide may be chosen in such a manner that it 
reflects the nature of the substrate so that a library of 
encoded substrates can be designed. In essence, 
when substrate libraries are exposed to a library of 
catalysts one can identify not only the presence of a 
catalyst, but also the catalyst's specificity as the 
sequence of the cleaved polynucleotide encodes and 
thus identifies which substrate sequence has been 
cleaved. In addition, the reaction cassette system 
allows one to follow bond formation events via the 
reverse pathway (Fig. (21)). In the bond cleavage 
detection mode, using α-chymotrypsin as the catalyst, 
the specificity of the reaction cassette was 
demonstrated through the selective recognition and 
cleavage of peptide substrates differing in sequence 
by only one amino acid (Ala2-Tyr-Ala2 versus 
Ala2-Phe-Ala2), and its sensitivity by the detection of 
reactions with half-lives of years in a matter of hours. 
The sensitivity (0.1-1 pmol, 5-50 nM) is the same whether 
using 0.01 mg (1 bead) or 10 mg (1000 beads), but is 
dependent on the concentration of α-chymotrypsin. The 
amount of this enzyme, however, can in principle be 
decreased to as few as 2400 molecules provided the 
concentration is kept >5 nM. In the bond formation 
detection mode, α-chymotrypsin catalyzed peptide bond formation was 
not possible because of steric hindrance, whereas 
chemically catalyzed bond formation (reductive 
amination of an aldehyde) could be performed, and was 
used as the prototype reaction for the demonstration of 
the second part of the principle of the reaction 
cassette, namely, the detection of the formation of a 
chemical bond. Finally, the reaction cassette can be 
assembled rapidly and efficiently and, in combination 
with a fluorescent assay, it allows for the rapid 
screening of libraries of catalysts and substrates. 

Fig. (21). Principle of the encoded reaction cassette. 
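
The decoding step of the reaction cassette can be illustrated with a minimal Python sketch. It assumes that every polynucleotide consists of two primer sites flanking a tag, and that each tag stands for one substrate. The primer sequences, tag sequences, and function names below are hypothetical and chosen only for illustration; they are not taken from the reported work [179].

```python
# Hypothetical primer sites flanking every tag (assumed sequences).
FORWARD_PRIMER = "GATTACA"
REVERSE_PRIMER = "TGTAATC"

# Hypothetical code book: each substrate on the support carries its own tag.
TAG_TO_SUBSTRATE = {
    "ACGTAC": "Ala2-Tyr-Ala2",
    "TGCATG": "Ala2-Phe-Ala2",
}


def decode_released_strand(strand):
    """Return the substrate encoded by a released strand, or None if unreadable."""
    if not (strand.startswith(FORWARD_PRIMER) and strand.endswith(REVERSE_PRIMER)):
        return None  # lacks both primer sites, so it would not be amplified
    tag = strand[len(FORWARD_PRIMER):len(strand) - len(REVERSE_PRIMER)]
    return TAG_TO_SUBSTRATE.get(tag)


def cleavage_profile(released_strands):
    """Count how often each encoded substrate appears among the released strands."""
    counts = {}
    for strand in released_strands:
        substrate = decode_released_strand(strand)
        if substrate is not None:
            counts[substrate] = counts.get(substrate, 0) + 1
    return counts
```

In this picture, a catalyst that prefers one substrate over another shows up as a skewed count in the dictionary returned by `cleavage_profile`.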



Nucleic Acids Based Receptors and Catalysts [180] 

As discussed in the previous section, very large 
libraries (up to 10^15 members) of random nucleic acid 
sequences (RNA/DNA) can be generated for the 
selective recognition of various types of small and large 
molecules. A few members from these libraries can be 
selected or evolved so that only the ones that bind or 
perform a chemical transformation are selected and 
amplified (PCR technology). The procedure can be 
repeated several times under more or less stringent 
conditions so that only a specific subset of molecules 
with the desired properties can be identified. For 
instance, 10^14 RNA 38mers of random sequences 
were prepared and screened for the selective 
recognition of theophylline versus caffeine. These two 
molecules differ from one another by only one methyl 
group. One of the oligonucleotides that was selected 
had an affinity for theophylline 10,000 times higher than 
for caffeine. This specificity is 100 times better than that 
of a monoclonal antibody presently used to monitor blood 
theophylline levels [161a]. Several strategies were designed so that 
catalytic RNA molecules are evolved from very large 
pools of randomized sequences [178a, 181]. Most 
interestingly, a recent report described a strategy that 
selects DNA oligonucleotides that catalyze the 
cleavage of RNA [182]. In this case, a DNA 50mer that 
accelerates a ribophosphodiester bond hydrolysis by a 
factor of 10^5 was selected from a pool of 10^14 random 
sequences. 
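
The effect of repeated selection and amplification can be illustrated with a simple, deterministic Python sketch. It assumes that a binder survives a selection round with one fixed probability and a non-binder with a much smaller one, and that PCR restores the pool to its original size without bias. The function names and all numerical values are illustrative assumptions, not data from the cited studies.

```python
def enrich_once(binder_fraction, keep_binder, keep_nonbinder):
    """One selection round followed by unbiased amplification.

    Returns the fraction of binders in the pool after the round.
    """
    kept_binders = binder_fraction * keep_binder
    kept_others = (1.0 - binder_fraction) * keep_nonbinder
    return kept_binders / (kept_binders + kept_others)


def rounds_to_reach(start_fraction, keep_binder, keep_nonbinder, target, max_rounds=100):
    """Number of rounds needed before binders exceed the target fraction."""
    fraction = start_fraction
    for round_number in range(1, max_rounds + 1):
        fraction = enrich_once(fraction, keep_binder, keep_nonbinder)
        if fraction > target:
            return round_number
    return None  # target not reached within max_rounds
```

With a single binder in 10^14 sequences and an assumed 100-fold enrichment per round (`keep_binder=0.5`, `keep_nonbinder=0.005`), the model needs eight rounds before binders make up more than 90% of the pool, which shows why only a handful of cycles is required even for very large libraries.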



Conclusion and Perspectives 

The growing number of acquisitions of early-stage 
combinatorial chemistry companies by pharmaceutical 
giants (Selectide by Marion Merrell Dow; Sphinx 
Pharmaceuticals by Eli Lilly; Affymax by 
Glaxo-Wellcome, and many more) is an indication of the 
recognition acquired by this new discipline [1g,183]. 

Combinatorial chemistry has successfully gone 
through a multitude of stages involved in the drug 
design process. Although no drugs discovered 
combinatorially have been approved for marketing, 
many have entered the initial phases of clinical trials. Isis 
Pharmaceuticals' inhibitor of HIV envelope-mediated 
cell fusion, obtained from a library of phosphorothioate 
oligonucleotides, is most likely the first 
compound discovered through a combinatorial 
approach to enter human clinical trials [184]. 
Researchers in industry and academia remain very 
optimistic about the future of combinatorial sciences, 
not only in medicinal chemistry but in every major area, 
including biotechnology, agrochemistry, materials 
science, molecular recognition and catalysis. We will 
certainly witness in the near future the 
combinatorialization of even more remote aspects of 
chemical sciences. Several elegant solutions have 
been proposed for the generation of large libraries of 
compounds and their screening for a particular activity. 
We shall expect other and more original approaches to 
these issues to come. For instance, a recent 
methodology allowing a dramatic optimization of 
combinatorial compound libraries using a genetic 
algorithm was reported [185]. This technique allows the 
identification of potential leads in a few steps, without 
synthesizing all possible compounds of a library. This is 
done simply by testing a set of individual 
compounds selected according to genetic rules 
(replication, crossover, and mutation). 
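
The genetic rules named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: compounds are modelled as tuples of building-block indices, and the `assay` function is a hypothetical stand-in for a real biological test. It is not the procedure of ref. [185].

```python
import random

N_POSITIONS = 3   # variable positions per compound (assumed)
N_BLOCKS = 10     # building blocks per position; full library = 10**3 compounds


def assay(compound):
    """Hypothetical activity score (higher is better); stands in for a real test."""
    target = (7, 2, 5)  # arbitrary "ideal" compound, for illustration only
    return -sum(abs(a - b) for a, b in zip(compound, target))


def crossover(parent_a, parent_b, rng):
    """Join the head of one parent to the tail of the other."""
    cut = rng.randint(1, N_POSITIONS - 1)
    return parent_a[:cut] + parent_b[cut:]


def mutate(compound, rng, rate=0.1):
    """Replace each building block by a random one with probability `rate`."""
    return tuple(rng.randrange(N_BLOCKS) if rng.random() < rate else block
                 for block in compound)


def optimize(generations=10, population_size=20, seed=0):
    rng = random.Random(seed)
    tested = {}  # compound -> score; only these compounds are ever "made"

    def score(compound):
        if compound not in tested:
            tested[compound] = assay(compound)
        return tested[compound]

    population = [tuple(rng.randrange(N_BLOCKS) for _ in range(N_POSITIONS))
                  for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        survivors = ranked[:population_size // 2]        # replication
        children = []
        while len(survivors) + len(children) < population_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    for compound in population:
        score(compound)
    best = max(tested, key=tested.get)
    return best, tested[best], len(tested)
```

Calling `optimize()` returns the best compound found, its score, and the number of distinct compounds tested. With the default settings at most 120 of the 1000 possible compounds are ever evaluated, which is the point of the approach.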

In the past three years, combinatorial chemistry has 
been drifting more and more from oligomeric libraries to 
libraries built on a scaffold/template. This approach 
allows the generation of smaller libraries (10^2-10^5 
members) that cover more effectively the structural, 
conformational, and physico-chemical space. This was 
also the precursor of a new concept in combinatorial 
chemistry, that is the concept of the representative 
library [23] or universal library. This type of library would 
minimize redundancies, cover the realm of structural 
and conformational space and the universe of diversity, 
shapes, functional group distribution and electrostatic 
surfaces. This idea was pushed even further towards a 
computer generated virtual library of compounds that 
can be electronically screened against a given receptor 
with known structure, or a receptor for which only a 
SAR study is available, in order to identify a group of 
building blocks and/or a scaffold that would fit most 
appropriately with the construction of the optimal real 
library [186]. The tremendous amount of information 
that can be extracted from library screening may also be 
used as a predictive tool for SAR: It may soon become 
possible to predict the activity of a molecule simply from 
its structure [3b]. 

Ultimately, when the solution structures of 
macromolecular receptors are easily accessible and 
their interactions with ligands predictable, and when 
organic synthesis of any target molecule is no longer a 
limiting factor, combinatorial chemistry will vanish. For the 
time being, and as long as these limitations exist, this 
science will remain perhaps the most powerful tool for 
drug discovery and many other research fields. 

Abbreviations 

CADD = Computer aided drug design 

QSAR = Quantitative structure activity relationship 






1996, Vol. 3, No. 5 



HTS = High throughput screening 

PS-PEG = Polystyrene-polyethyleneglycol 

ELISA = Enzyme linked immunosorbent assay 

PEG = Polyethyleneglycol 

TOF-MALDI-MS = Time of flight matrix assisted laser desorption ionization mass spectrometry 

NMR = Nuclear magnetic resonance 

IR = Infrared 

TOF-SIMS = Time of flight secondary ion mass spectrometry 

PCR = Polymerase chain reaction 

PKS = Polyketide synthase 